CULMINATING PATHS 



MIREILLE BOUSQUET-MELOU AND YANN PONTY 



Abstract. Let a and b be two positive integers. A culminating path is a path of $\mathbb{Z}^2$ that
starts from (0,0), consists of steps (1, a) and (1, -b), stays above the x-axis and ends at the
highest ordinate it ever reaches. These paths were first encountered in bioinformatics, in the 
analysis of similarity search algorithms. They are also related to certain models of Lorentzian 
gravity in theoretical physics. 

We first show that the language on a two letter alphabet that naturally encodes culminating 
paths is not context-free. 

Then, we focus on the enumeration of culminating paths. A step by step approach, com- 
bined with the kernel method, provides a closed form expression for the generating function 
of culminating paths ending at a (generic) height k. In the case a = b, we derive from this 
expression the asymptotic behaviour of the number of culminating paths of length n. When
a > b, we obtain the asymptotic behaviour by a simpler argument. When a < b, we only
determine the exponential growth of the number of culminating paths. 

Finally, we study the uniform random generation of culminating paths via various methods. 
The rejection approach, coupled with a symmetry argument, gives an algorithm that is linear 
when a>b, with no precomputation stage nor non-linear storage required. The choice of the 
best algorithm is not as clear when a < b. An elementary recursive approach yields a linear 
algorithm after a precomputation stage involving $O(n^3)$ arithmetic operations, but we also
present some alternatives that may be more efficient in practice. 



1. Introduction 

One-dimensional lattice walks on $\mathbb{Z}$ have been extensively studied over the past 50 years.
These walks usually start from the point 0, and take their steps in a prescribed finite set $S \subset \mathbb{Z}$. A
large number of results are now known on the enumeration of sub-families of these walks, and can
be obtained in a systematic way once the set S is given. This includes the enumeration of bridges
(walks ending at 0), meanders (walks that always remain at a non-negative level), excursions
(meanders ending at level 0), excursions of bounded height, and so on. In particular, the nature
of the associated generating functions is well understood: these series are always algebraic, and
even rational for bounded walks [2, 5, 10, 8, 19, 26, 31, 32, 37]. These algebraicity properties
actually reflect the fact that the languages on the alphabet S that naturally encode these families
of walks are context-free, and even regular in the bounded case. In many papers, these
one-dimensional walks are actually described as directed two-dimensional (2D) walks, upon replacing
the starting point by (0, 0) and every step s by (1, s). This explains why excursions are often
called generalized Dyck paths (the authentic Dyck paths correspond to the case $S = \{1, -1\}$).
This two-dimensional setting allows for a further generalisation, with steps of the form (i, j),
with $i > 0$ and $j \in \mathbb{Z}$, but this does not affect the nature of the associated languages and
generating functions. The uniform random generation of these walks has also been investigated,
through a recursive approach [39, 24, 20] or using an anticipated rejection [6, 33].

This paper deals with a new class of walks which has recently occurred in two independent 
contexts, and seems to have a more complicated structure than the above mentioned classes: 
culminating walks. A 2D directed walk is said to be culminating if each step ends at a positive
level, and the final step ends at the highest level ever reached by the walk (Figure 1). We focus
here on the case where the steps are (1, a) and (1, -b), with a and b positive, hoping that this
encapsulates all the possible typical behaviours.

Figure 1. A culminating path (for a = 5 and b = 3) and the corresponding word.

In the case a = b = 1, culminating walks have recently been shown to be in bijection with
certain Lorentzian triangulations [18], a class of combinatorial objects studied in theoretical
physics as a model of discrete two-dimensional Lorentzian gravity. Using a transfer matrix
approach, the authors derived the generating function for this case. We give two shorter proofs
of their result. Also, while it is not clear how the method used in [18] could be extended to the
general (a, b)-case, one of our approaches works for arbitrary values of a and b.

The general (a, b)-case appears in bioinformatics in the study of the sensitivity of heuristic
homology search algorithms, such as BLAST, FASTA or FLASH [HlMlin]- These algorithms 
aim at finding the most conserved regions (similarities) between two genomic sequences (DNA,
RNA, proteins...) while allowing certain alterations in the entries of the sequences. In order 
to avoid the supposedly intrinsic quadratic complexity of the deterministic algorithms, these 
heuristic algorithms first consider identical regions of bounded size and extend them in both 
directions, updating the score with a bonus for a match or a penalty for an alteration, until the 
score drops below a certain threshold. The evolution of the score all the way through the final 
alignment turns out to be encoded by a culminating walk. 

In [30], we first studied the probability of a culminating walk to contain certain patterns
called seeds, as some recent algorithms make use of them to relax the mandatory conservation of
small anchoring portions. Then, we proposed a variant of the recursive approach for the random
generation of these walks. Finally, we observed that the naive rejection-based algorithm, which
consists in drawing uniformly at random up and down steps and rejecting the resulting walk if
it is not culminating, seemed to be linear (resp. exponential) when a > b (resp. a < b). This
observation, which is closely related to the asymptotic enumeration of culminating walks, is
confirmed below in Section 6.2.
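The naive rejection scheme described above can be sketched as follows. This is only an illustration under our own conventions (a walk is a list of booleans, `True` for an up step; the function names are hypothetical and are not taken from the authors' Java software).

```python
import random

def is_culminating(steps, a, b):
    """steps: list of booleans, True for an up step +a, False for a down step -b."""
    heights, h = [], 0
    for up in steps:
        h += a if up else -b
        heights.append(h)
    if not heights or min(heights) <= 0:
        return False  # positivity fails (or the walk is empty)
    # the final height must be a strict record
    return all(x < heights[-1] for x in heights[:-1])

def naive_rejection(n, a, b, rng=random):
    """Draw uniform walks of length n until one of them is culminating."""
    while True:
        steps = [rng.random() < 0.5 for _ in range(n)]
        if is_culminating(steps, a, b):
            return steps
```

Since every word of length n is drawn with the same probability, the accepted walk is uniform among culminating walks of length n. The expected number of trials is the inverse of the proportion of culminating walks, which is why the behaviour depends on the sign of a - b.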

To conclude this introduction, let us fix the notation and summarize the contents of this paper.
Let a and b be two positive integers. A walk (or path) of length n is a sequence $(0, \eta_0), \ldots, (n, \eta_n)$
such that $\eta_0 = 0$ and $\eta_{i+1} - \eta_i \in \{a, -b\}$ for all i. The height of the walk is the largest of the
$\eta_i$'s, while the final height is $\eta_n$. The walk is culminating if the two following conditions hold:

$\forall i \in [1, n], \ \eta_i > 0$ (Positivity),

$\forall i \in [0, n-1], \ \eta_i < \eta_n$ (Final record).

See Figures 1 and 2 for examples and counter-examples. We encode every walk by a word on
the alphabet $\{m, \bar m\}$ in a standard way: each ascending step (1, a) is replaced by a letter $m$
and each descending step (1, -b) is replaced by a letter $\bar m$. We denote by $\{m, \bar m\}^*$ the set of
words on the alphabet $\{m, \bar m\}$. From now on, we identify a path and the corresponding word.
Since these objects are essentially one-dimensional, we will often use a 1D vocabulary, saying,
for instance, that our paths take steps +a and -b (rather than (1, a) and (1, -b)). We hope that
this will not cause any confusion. Without loss of generality, we restrict our study to the case
where a and b are coprime.

For any word w, we denote by $|w|_m$ (resp. $|w|_{\bar m}$) the number of occurrences of the letter $m$
(resp. $\bar m$) that it contains. We denote by $|w|$ the length of w. The function $\phi_{a,b} : \{m, \bar m\}^* \to \mathbb{Z}$
maps a word to the final height of the corresponding walk. That is, $\phi_{a,b}(w) = a|w|_m - b|w|_{\bar m}$.
The culmination properties can be translated into the following language-theoretic definition:

Definition 1.1. The language of culminating words is the set $\mathcal{C}^{a,b} \subset \{m, \bar m\}^*$ of words w such
that, for every non-empty prefix w' of w:

$\phi_{a,b}(w') > 0$ (Positivity),

and, for every proper prefix w' of w:

$\phi_{a,b}(w') < \phi_{a,b}(w)$ (Final record).
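Definition 1.1 translates directly into a membership test. The sketch below is an illustration only: it encodes the barred letter as the ASCII character `M` (an assumption made for this example), and the brute-force counter is exponential in n, so it is only meant for checking small cases.

```python
from itertools import product

def is_culminating(word, a, b):
    """Test Definition 1.1 on a word over {'m', 'M'} ('M' stands for the barred letter)."""
    heights, h = [], 0
    for letter in word:
        h += a if letter == "m" else -b
        heights.append(h)
    # Positivity: every non-empty prefix has a positive final height.
    if any(x <= 0 for x in heights):
        return False
    # Final record: every proper prefix ends strictly below the whole word.
    return all(x < heights[-1] for x in heights[:-1])

def count_culminating(n, a, b):
    """Brute-force count of culminating words of length n (2^n candidates)."""
    return sum(is_culminating(w, a, b) for w in product("mM", repeat=n))
```

For a = b = 1 and n = 1, ..., 7 this count gives 1, 1, 1, 1, 2, 3, 5.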

The main result of Section 2 is that the language $\mathcal{C}^{a,b}$ is not context-free. In Section 3,
we obtain a closed form expression for the generating function of culminating walks. This
expression is complicated, but we believe this only reflects the complexity of this class of walks.
This enumerative section is closely related to the recent work [10], devoted to a general study
of excursions confined in a strip. In particular, symmetric functions play a slightly surprising
role in the proof and statement of our results. We then derive in Section 4 the asymptotic
number of culminating walks, in the case $a \ge b$. Our result implies that, asymptotically, a
positive fraction of (general) (a, b)-walks are culminating if a > b. We prove that this fraction
tends to 0 exponentially fast if a < b. More precisely, we determine the exponential growth
of the number of culminating walks. This asymptotic section uses the results obtained in [5]
on the exact and asymptotic enumeration of excursions and meanders. Finally, in Section 6,
we present several algorithms for generating uniformly at random culminating walks of a given
length. Our best algorithms are linear when a > b. When a < b, the choice of the best algorithm
is not obvious. An elementary recursive approach yields a quasi-linear generating stage but
requires the precomputation and storage of $O(n^2)$ numbers. We exploit in this section several
generation schemes, like the recursive method [39, 24], the rejection method [14] and Boltzmann
samplers [20]. Moreover, we address in Section 5 the random generation of positive walks,
which is a preliminary step in some of our algorithms generating culminating walks. We have
implemented our algorithms in Java, and we invite the reader to generate his/her own paths at
the address http://www.lri.fr/~ponty/walks. Figure 3 shows random culminating paths of
length 1000 generated with our software, for various values of a and b.

2. Language theoretic properties 

We denote by $\mathcal{C}^{a,b}_k$ the subset of $\mathcal{C}^{a,b}$ that consists of the walks (words) ending at height k.
It will be easily seen that this language (for a fixed k) is regular. However, we shall prove that
the full language $\mathcal{C}^{a,b}$ is not context-free. We refer to [27] for definitions on languages.

Figure 3. Random culminating paths of size 1000, when (a, b) = (1, 1), (a, b) =
(2, 1), (a, b) = (1, 2). In the first two cases, four paths are displayed, while for
the sake of clarity, only one path is shown in the third case.



2.1. Culminating walks of bounded height 

Proposition 2.1. For all $a, b, k \in \mathbb{N}$, the language $\mathcal{C}^{a,b}_k$ of culminating words ending at height
k is regular.

Proof. The culminating paths of final height k move inside a bounded space. This allows us to
construct a (deterministic) finite-state automaton that recognizes these paths. The states of this
automaton are the accessible heights (that is, 0, 1, ..., k), plus a garbage state $\perp$. The initial
state is 0, the final state is k, and the transition function $\delta$ is given, for $0 \le q < k$, by:

$$\delta(q, m) = \begin{cases} q + a & \text{if } q \le k - a, \\ \perp & \text{otherwise,} \end{cases} \qquad \delta(q, \bar m) = \begin{cases} q - b & \text{if } q > b, \\ \perp & \text{otherwise,} \end{cases}$$

while

$$\delta(k, \cdot) = \delta(\perp, \cdot) = \perp.$$

Clearly, this automaton sends any word attempting to walk to level 0 or below (resp. above k) to the
garbage state $\perp$, where it will stay forever and therefore be rejected. Moreover, it only accepts those
words ending in the state k. Hence this automaton recognizes exactly $\mathcal{C}^{a,b}_k$. Since the state
space is finite, $\mathcal{C}^{a,b}_k$ is a regular language. ■
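The automaton of this proof is easy to simulate. The following sketch is an illustration under our own encoding (`None` plays the role of the garbage state, `M` stands for the barred letter, and the function names are hypothetical); the second function re-implements Definition 1.1 so that the two can be compared on small words.

```python
from itertools import product

def automaton_accepts(word, a, b, k):
    """Run the automaton of Proposition 2.1 on a word over {'m', 'M'}."""
    GARBAGE = None
    q = 0  # initial state
    for letter in word:
        if q is GARBAGE or q == k:
            q = GARBAGE                       # delta(k, .) = delta(garbage, .) = garbage
        elif letter == "m":
            q = q + a if q <= k - a else GARBAGE
        else:
            q = q - b if q > b else GARBAGE
    return q is not GARBAGE and q == k        # k is the only final state

def culminating_height(word, a, b):
    """Final height of the word if it is culminating (Definition 1.1), else None."""
    heights, h = [], 0
    for letter in word:
        h += a if letter == "m" else -b
        heights.append(h)
    if not heights or min(heights) <= 0:
        return None
    if max(heights[:-1], default=0) >= heights[-1]:
        return None
    return heights[-1]
```

On all words of length at most 8, the automaton accepts exactly the culminating words of final height k, for the few values of (a, b) and k tried in the accompanying check.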

2.2. Unbounded culminating walks 

Proposition 2.2. For all $a, b \in \mathbb{N}$, the language $\mathcal{C}^{a,b}$ of culminating walks is not context-free.

Proof. Recall that the intersection of a context-free language and a regular language is
context-free [27]. Let $\mathcal{L}$ be the following regular language: $\mathcal{L} = m^* . \bar m^* . m^*$. It can be seen as the language
of "zig-zag" paths. Let $\mathcal{K} = \mathcal{C}^{a,b} \cap \mathcal{L}$. It is easy to see that

$$\mathcal{K} = \{ m^i . \bar m^j . m^k \mid i > 0,\ bj < ai \text{ and } bj < ak \}.$$

Assume that $\mathcal{C}^{a,b}$ is context-free. Then so is $\mathcal{K}$, and, by the pumping lemma for context-free
languages [27, Theorem 4.7], there exists $n \in \mathbb{N}$ such that any word $w \in \mathcal{K}$ of length at least n
admits a factorisation w = x.u.y.v.z satisfying the following properties:

(i) $|u.v| \ge 1$,

(ii) $|u.y.v| \le n$,

(iii) $\forall \ell \ge 0$, $w_\ell := x.u^\ell.y.v^\ell.z \in \mathcal{K}$.

Since a and b are coprime, there exist i > n and j > n such that ia - jb = 1 (this is the
Bachet-Bézout theorem). Hence the word $w = m^i \bar m^j m^i$ belongs to $\mathcal{K}$. In the rest of the proof,
we will refer to the first sequence of ascending steps of w as A, to the descending sequence as B
and to the second ascending sequence as C.



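| Where is the factor $u.y.v$? | $\ell$ | $w_\ell$ | Failing condition |
|---|---|---|---|
| $A$ | 0 | $m^{i-h} . \bar m^{j} . m^{i}$ | Pos.: $\phi(m^{i-h} \bar m^{j}) = 1 - ah \le 0$ |
| $B$ | 2 | $m^{i} . \bar m^{j+h} . m^{i}$ | Pos.: $\phi(m^{i} \bar m^{j+h}) = 1 - bh \le 0$ |
| $C$ | 0 | $m^{i} . \bar m^{j} . m^{i-h}$ | Fin. rec.: $\phi(w_0) = \phi(m^{i}) + 1 - ah \le \phi(m^{i})$ |
| $A \cup B$, with $\lvert u \rvert_m . \lvert u \rvert_{\bar m} + \lvert v \rvert_m . \lvert v \rvert_{\bar m} \ne 0$ | 2 | $m^{+} . \bar m^{+} . m^{+} . \bar m^{+} . m^{+}$ | $w_2 \notin \mathcal{L}$ (Too many peaks) |
| $A \cup B$, with $u = m^{k}$, $v = \bar m^{k'}$ | 2 | $m^{i+k} . \bar m^{j+k'} . m^{i}$ | Final record: $\phi(w_2) = \phi(m^{i+k}) + 1 - bk' \le \phi(m^{i+k})$ |
| $B \cup C$, with $\lvert u \rvert_m . \lvert u \rvert_{\bar m} + \lvert v \rvert_m . \lvert v \rvert_{\bar m} \ne 0$ | 2 | $m^{+} . \bar m^{+} . m^{+} . \bar m^{+} . m^{+}$ | $w_2 \notin \mathcal{L}$ (Too many valleys) |
| $B \cup C$, with $u = \bar m^{k}$, $v = m^{k'}$ | 2 | $m^{i} . \bar m^{j+k} . m^{i+k'}$ | Pos.: $\phi(m^{i} \bar m^{j+k}) = 1 - kb \le 0$ |

In the first three rows, $h = |u.v|$; in the monotone cases, k and k' are positive. We write $\phi$ for $\phi_{a,b}$, and $m^{+}$ (resp. $\bar m^{+}$) for a non-empty block of letters $m$ (resp. $\bar m$).

Table 1. Why the pumping lemma is not satisfied.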



In Table 1, we consider all eligible factorisations of w of the form w = x.u.y.v.z. Five cases
arise, depending on which part of w contains the factor u.y.v. Condition (ii) implies that this
factor cannot overlap simultaneously with the parts A and C. Each of the cases $A \cup B$ and $B \cup C$
is further subdivided into two cases, depending on whether u and v are monotone or not.






For each factorisation, the table gives a value of $\ell$ for which the word $w_\ell$ does not belong to
$\mathcal{K}$. This is justified in the rightmost column: either $w_\ell$ does not belong to the set $\mathcal{L}$ of zig-zag
paths, or the positivity condition does not hold, or the last step of the walk is not a record.

Once all the possible factorisations have been investigated and found not to satisfy the pumping
lemma, we conclude that the languages $\mathcal{K}$ and $\mathcal{C}^{a,b}$ are not context-free. ■



3. Exact enumerative results 

In this section, we give a closed form expression for the generating function of (a, b)-culminating
walks. More precisely, we give an expression for the series counting culminating
walks of height k, and then sum over k. This summation makes the series a bit difficult to handle,
for instance to extract the asymptotic behaviour of the coefficients (Section 4). We believe
that this complexity is inherent to the problem. In particular, we prove that the generating
function of (1, 1)-culminating walks is not only transcendental, but also not D-finite. That is, it
does not satisfy any linear differential equation with polynomial coefficients [37, Ch. 6].

3.1. Statement of the results and discussion 

Let us first state our results in the (1, 1)-case and then explain what form they take in the
general (a, b)-case.

Proposition 3.1. Let a = b = 1 and $k \ge 1$. The length generating function of culminating
paths of height k is

$$C_k(t) = \frac{t^k}{F_{k-1}} = t\, \frac{U_1 - U_2}{U_1^k - U_2^k} = \frac{1 - U^2}{1 + U^2}\, \frac{U^k}{1 - U^{2k}},$$

where

• $F_k$ is the kth Fibonacci polynomial, defined by $F_0 = F_1 = 1$ and $F_k = F_{k-1} - t^2 F_{k-2}$ for
$k \ge 2$,

• $U_1$ and $U_2$ are the two roots of the polynomial $u - t(1 + u^2)$:

$$U_{1,2} = \frac{1 \mp \sqrt{1 - 4t^2}}{2t},$$

• U stands for any of the $U_i$'s.

The generating function of culminating walks,

$$C(t) = \sum_{k \ge 1} C_k(t) = \frac{1 - U^2}{1 + U^2} \sum_{k \ge 1} \frac{U^k}{1 - U^{2k}},$$

is not D-finite.

The above expression of C(t) is equivalent to the case x = y = 1 of [18, Eq. (2.26)].
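As a quick sanity check of the Fibonacci form $t^k/F_{k-1}$ (a worked example added for illustration), one has $F_2 = 1 - t^2$, $F_3 = 1 - 2t^2$ and $F_4 = 1 - 3t^2 + t^4$, so that the first series $C_k$ and the first coefficients of $C(t)$ read:

```latex
\begin{align*}
C_1 &= t, \qquad C_2 = t^2, \\
C_3 &= \frac{t^3}{1 - t^2} = t^3 + t^5 + t^7 + \cdots, \\
C_4 &= \frac{t^4}{1 - 2t^2} = t^4 + 2t^6 + 4t^8 + \cdots, \\
C_5 &= \frac{t^5}{1 - 3t^2 + t^4} = t^5 + 3t^7 + \cdots, \\
C(t) &= t + t^2 + t^3 + t^4 + 2t^5 + 3t^6 + 5t^7 + O(t^8).
\end{align*}
```

For instance, the two culminating walks of length 5 are $mmmmm$ (height 5) and $mm\bar m mm$ (height 3).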

The first expression of $C_k$, in terms of the Fibonacci polynomials, is clearly rational. As
explained in Section 2.1, the language of culminating walks of height k is regular for all a and b,
so that the series $C_k$ will always be rational. Of course, $C_k$ is simply 0 when k < a. When k = a,
there is only one culminating path, reduced to one up step, so that $C_k = t$. More generally, the
following property, illustrated in Figure 4 and proved in Section 3.2.1, holds.

Property 3.2. For $k \le a + b$, there is at most one culminating path of height k.

As soon as k > a, culminating walks of height k have at least two steps. Deleting the first
and last ones gives $C_k = t^2 W_k$, where $W_k$ counts walks (with steps +a, -b) going from a to k - a
on the segment $[1, k-1]$. General (and basic) results on the enumeration of walks on a digraph
provide [36, Ch. 4]:

$$C_k = t^2 W_k = t^2 \left( (1 - tA_k)^{-1} \right)_{a, k-a} = t^2\, \frac{N_k}{D_k}, \qquad (2)$$

where $A_k = (A_{i,j})_{1 \le i,j \le k-1}$ is the adjacency matrix of our segment graph:

$$A_{i,j} = \begin{cases} 1 & \text{if } j = i + a \text{ or } j = i - b, \\ 0 & \text{otherwise,} \end{cases} \qquad (3)$$

$D_k$ is the determinant of $(1 - tA_k)$ and $N_k/D_k$ is the entry $(a, k-a)$ of $(1 - tA_k)^{-1}$.

Figure 4. When a = 5 and b = 3, there is no culminating walk of height k, for
$k \in [1, 8] \setminus \{5, 7, 8\}$. For k = 5, 7, 8, there is exactly one culminating walk.
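The identity $C_k = t^2 W_k$ also gives a simple way to compute the first coefficients of $C_k$ without inverting any matrix: one counts walks on the segment graph by dynamic programming. The sketch below is an illustration only (the function name and its interface are ours); it handles the cases k < a and k = a separately, as in the text.

```python
def culminating_by_height(k, a, b, nmax):
    """coeffs[n] = number of culminating walks of length n and height k, for n <= nmax."""
    coeffs = [0] * (nmax + 1)
    if k < a:
        return coeffs                 # C_k = 0
    if k == a:
        if nmax >= 1:
            coeffs[1] = 1             # C_k = t: a single up step
        return coeffs
    # k > a: C_k = t^2 W_k, where W_k counts walks from a to k - a inside [1, k-1].
    ways = [0] * k                    # ways[q], 1 <= q <= k-1 (index 0 unused)
    ways[a] = 1                       # the empty walk sits at level a
    for n in range(2, nmax + 1):
        coeffs[n] = ways[k - a]       # inner walks of length n - 2 ending at k - a
        new = [0] * k
        for q in range(1, k):
            if ways[q]:
                if q + a <= k - 1:
                    new[q + a] += ways[q]
                if q - b >= 1:
                    new[q - b] += ways[q]
        ways = new
    return coeffs
```

For a = b = 1 and k = 4 this returns the coefficients of $t^4/(1 - 2t^2)$, and for a = 5, b = 3, k = 8 it finds a single walk, of length 8, in agreement with Property 3.2.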

We note from Proposition 3.1 that, in the (1, 1)-case, both $N_k$ and $D_k$ are especially simple.
Indeed, $N_k = t^{k-2}$, while $D_k = F_{k-1}$ satisfies a linear recurrence relation (with constant
coefficients) of order 2. We will prove that, for all a and b, both sequences $N_k$ and $D_k$ satisfy such a
recurrence relation (of a larger order in general). The monomial form of $N_k$ will hold as soon as
a = 1.

The second expression of $C_k$ given in Proposition 3.1 appears as a rational function of the
roots of the polynomial $u - t(1 + u^2)$. Even though both series $U_1$ and $U_2$ are algebraic (and
irrational), the fact that $C_k$ is symmetric in $U_1$ and $U_2$ explains why $C_k$ itself is rational. In
general, we will write $C_k$ as a symmetric rational function of the a + b roots of the polynomial
$u^b - t(1 + u^{a+b})$, denoted $U_1, \ldots, U_{a+b}$.

The third expression of $C_k$ follows from the fact that $U_1 U_2 = 1$. In general, $t = U^b/(1 + U^{a+b})$
for $U = U_i$, so that it will always be possible to write $C_k$ as a rational function of U. However,
this expression will not be always as simple as above. The equivalence of the three expressions
of Proposition 3.1 follows easily from the fact that

$$F_k = \frac{1 - U^{2k+2}}{(1 - U^2)(1 + U^2)^k}.$$

This can be proved by solving the recurrence relation satisfied by the $F_k$'s, or can be checked
by induction on k.



Let us now state our generalisation of Proposition 3.1 to (a, b)-culminating walks. Our first
expression of $C_k$, namely the rational form (2), involves the evaluation of two determinants
of size (approximately) k. Our second expression of $C_k$ will be a fixed rational function of
$U_1, \ldots, U_{a+b}, U_1^k, \ldots, U_{a+b}^k$, symmetric in the $U_i$, which involves two determinants of constant
size a + b. The existence of such smaller determinantal forms for walks confined in a strip
has already been recognized in [Sj Ch. 1]. More recently, the case of excursions confined in a
strip has been simplified and worked out in greater detail [10]. As in [10], our results will be
expressed in terms of the Schur functions $s_\lambda$, which form one of the most important bases of
symmetric functions in n variables $x_1, \ldots, x_n$: for any integer partition $\lambda$ with at most n parts,
$\lambda = (\lambda_1, \ldots, \lambda_n)$ with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$,

$$s_\lambda(X) = \frac{a_{\delta + \lambda}}{a_\delta}, \qquad (4)$$

with $X = (x_1, \ldots, x_n)$, $\delta = (n-1, n-2, \ldots, 1, 0)$ and $a_\mu = \det\left( x_i^{\mu_j} \right)_{1 \le i,j \le n}$. We refer to [37,
Ch. 7] for generalities on symmetric functions.

Proposition 3.3. Let k > a. With the above notation, the length generating function of (a, b)-culminating
paths of height k admits the following expressions:

$$C_k(t) = t^2 \left( (1 - tA_k)^{-1} \right)_{a, k-a} = t\, \frac{s_\mu(U)}{s_\lambda(U)},$$

where $A_k$ is given by (3), the (a + b)-tuple $U = (U_1, \ldots, U_{a+b})$ is the collection of roots of
the polynomial $u^b - t(1 + u^{a+b})$, and the partitions $\lambda$ and $\mu$ are given by $\lambda = (k-1)^a$ and
$\mu = ((k-1)^{a-1}, a-1)$.

The determinant $D_k$ of $(1 - tA_k)$ and the relevant cofactor $N_k$ are respectively given by

$$D_k = (-1)^{(a-1)(k-1)}\, t^{k-1}\, s_\lambda(U) \quad \text{and} \quad N_k = (-1)^{(a-1)(k-1)}\, t^{k-2}\, s_\mu(U). \qquad (5)$$

Both sequences $N_k$ and $D_k$ satisfy a linear recurrence relation with coefficients in $\mathbb{Q}[t]$,
respectively of order $\binom{a+b}{a-1}$ and $\binom{a+b}{a}$. These orders are optimal.

Note that the expression of $C_k$ in terms of Schur functions still holds for k = a. Examples will
be given below. For the moment, let us underline that the case a = 1 of this proposition takes
a remarkably simple form, which will be given a combinatorial explanation in Section 3.2.3.



Corollary 3.4. When a = 1, the generating function of culminating walks of height $k \ge 1$ reads

$$C_k(t) = \frac{t^k}{D_k} = \frac{t}{h_{k-1}(U)},$$

where $h_i$ is the complete homogeneous symmetric function of degree i, $D_k = 1$ for $1 \le k \le b + 1$
and $D_k = D_{k-1} - t^{b+1} D_{k-b-1}$ for k > b + 1.



Examples. Let us illustrate Proposition 3.3 by writing down explicitly the expression of $C_k$ for
a few values of a and b. We use the determinantal form (4) of Schur functions.

Case a = b = 1. Here $U_1$ and $U_2$ are the two roots of the polynomial $u - t(1 + u^2)$. The partition
$\mu$ is empty, so that $s_\mu = 1$, while $\lambda = (k-1)$. This gives

$$C_k = t\, \frac{\begin{vmatrix} U_1 & 1 \\ U_2 & 1 \end{vmatrix}}{\begin{vmatrix} U_1^k & 1 \\ U_2^k & 1 \end{vmatrix}} = t\, \frac{U_1 - U_2}{U_1^k - U_2^k},$$

as in Proposition 3.1. The recurrence relations satisfied by the polynomials $N_k$ and $D_k$ can
always be worked out from their expressions (5), as will be explained in Section 3.2.2. In the
case a = b = 1, one finds

$$C_k = t^2 N_k / D_k \quad \text{with} \quad N_k = t^{k-2} \quad \text{and} \quad D_k = D_{k-1} - t^2 D_{k-2},$$

with initial conditions $D_1 = D_2 = 1$.

Case a = 1, b = 2. Here $U_1, U_2, U_3$ are the three roots of the polynomial $u^2 - t(1 + u^3)$. Again,
$\mu$ is empty and $\lambda = (k-1)$ (this holds as soon as a = 1). One obtains

$$C_k = t\, \frac{\begin{vmatrix} U_1^2 & U_1 & 1 \\ U_2^2 & U_2 & 1 \\ U_3^2 & U_3 & 1 \end{vmatrix}}{\begin{vmatrix} U_1^{k+1} & U_1 & 1 \\ U_2^{k+1} & U_2 & 1 \\ U_3^{k+1} & U_3 & 1 \end{vmatrix}}.$$

The rational expression of $C_k$ reads

$$C_k = t^2 N_k / D_k \quad \text{with} \quad N_k = t^{k-2} \quad \text{and} \quad D_k = D_{k-1} - t^3 D_{k-3},$$

with initial conditions $D_1 = D_2 = D_3 = 1$. Note that this expression allows us to compute in a
few seconds the number $c_n$ of culminating walks for n up to 500.

Case a = 2, b = 1. Here $U_1, U_2, U_3$ are the three roots of the polynomial $u - t(1 + u^3)$. One has
$\mu = (k-1, 1)$ and $\lambda = (k-1)^2$, which gives:

$$C_k = t\, \frac{\begin{vmatrix} \bar U_1^{k+1} & \bar U_1^{k-1} & 1 \\ \bar U_2^{k+1} & \bar U_2^{k-1} & 1 \\ \bar U_3^{k+1} & \bar U_3^{k-1} & 1 \end{vmatrix}}{\begin{vmatrix} \bar U_1^{k+1} & \bar U_1 & 1 \\ \bar U_2^{k+1} & \bar U_2 & 1 \\ \bar U_3^{k+1} & \bar U_3 & 1 \end{vmatrix}},$$

where $\bar U_i := 1/U_i$. Note that the series $\bar U_i$ are the roots of the polynomial $u^2 - t(1 + u^3)$, which
occurs in the (symmetric) case a = 1, b = 2. It is actually clear from (3) that the denominator
$D_k$ is unchanged when exchanging a and b.
The rational expression of $C_k$ reads

$$C_k = t^2 N_k / D_k \quad \text{with} \quad N_k = t N_{k-2} + t^3 N_{k-3} \quad \text{and} \quad D_k = D_{k-1} - t^3 D_{k-3},$$

with initial conditions $N_1 = 0$, $t^2 N_2 = N_3 = t$ and $D_1 = D_2 = D_3 = 1$.
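When a = 1, the rational form $C_k = t^k/D_k$ with $D_k = D_{k-1} - t^{b+1} D_{k-b-1}$ (Corollary 3.4) gives a direct way of computing the numbers $c_n$ of culminating walks of length n. The sketch below is a naive illustration with truncated power series stored as Python lists of integers; the function names are ours, and no attempt is made to reach the running times mentioned above.

```python
def series_inverse(p, nmax):
    """Coefficients up to t^nmax of 1/p, for a polynomial p (list of ints) with p[0] == 1."""
    inv = [0] * (nmax + 1)
    inv[0] = 1
    for n in range(1, nmax + 1):
        inv[n] = -sum(p[i] * inv[n - i] for i in range(1, min(n, len(p) - 1) + 1))
    return inv

def culminating_counts_a1(b, nmax):
    """counts[n] = number of (1, b)-culminating walks of length n, for 1 <= n <= nmax."""
    D = {}                                    # D[k]: coefficients of D_k, truncated at t^nmax
    counts = [0] * (nmax + 1)
    for k in range(1, nmax + 1):              # a walk of length n has height at most n
        if k <= b + 1:
            Dk = [1] + [0] * nmax             # D_k = 1 for 1 <= k <= b + 1
        else:
            Dk = D[k - 1][:]                  # D_k = D_{k-1} - t^(b+1) D_{k-b-1}
            for i in range(nmax + 1 - (b + 1)):
                Dk[i + b + 1] -= D[k - b - 1][i]
        D[k] = Dk
        inv = series_inverse(Dk, nmax - k)    # C_k = t^k / D_k
        for n in range(k, nmax + 1):
            counts[n] += inv[n - k]
    return counts
```

For b = 1 the first values are 1, 1, 1, 1, 2, 3, 5, and for b = 2 they are 1, 1, 1, 1, 1, 1, 2, 3.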

3.2. Proofs 

3.2.1. Proof of Property 3.2. Let us say that a path is positive if every step ends at a positive
level. For instance, culminating walks are positive. For $n \ge 0$ there exists a unique positive walk
of length n and height at most a + b, denoted $w_n$. Indeed, given $h \in [0, a+b]$, exactly one of
the values h + a, h - b lies in the interval $[1, a+b]$. For the same reason, $w_i$ is a prefix of $w_j$ for
$i \le j$. Let $k \le a + b$, and assume that there exist two distinct culminating walks of height k.
These walks must be $w_i$ and $w_j$, for some i and j, with, say, i < j. But then $w_i$ is a prefix of
$w_j$, and ends at height k, which prevents $w_j$ from being culminating. ■
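The argument of this proof is constructive: the unique positive walk of height at most a + b goes up when the current height is at most b, and down otherwise. The sketch below (an illustration with hypothetical function names) builds this walk and lists the heights $k \le a + b$ for which one of its prefixes is culminating, that is, the strict records of the walk.

```python
def bounded_positive_walk(n, a, b):
    """Heights (including the starting point 0) of the unique positive walk of
    length n that stays in [1, a+b]."""
    heights = [0]
    for _ in range(n):
        h = heights[-1]
        # exactly one of h + a, h - b lies in [1, a+b]
        heights.append(h + a if h <= b else h - b)
    return heights

def record_heights(a, b):
    """Heights k <= a + b that are the final height of a culminating walk."""
    records, best = [], 0
    for h in bounded_positive_walk(a + b, a, b)[1:]:
        if h > best:                  # strict record: this prefix is culminating
            records.append(h)
            best = h
    return records
```

For a = 5 and b = 3 the heights visited are 5, 2, 7, 4, 1, 6, 3, 8, so the records are 5, 7 and 8, as in Figure 4.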

3.2.2. Proof of Proposition 3.3. The expression of $C_k$ in terms of the adjacency matrix
$A_k$ has been justified in Section 3.1. Let us now derive the Schur function expression of this
series. We will actually give two proofs of this expression: the first one is based on the kernel
method jHHHIS], and the second one on the Jacobi-Trudi identity. The first proof is completely 
elementary. The second one allows us to relate the polynomials $N_k$ and $D_k$ to the Schur functions
$s_\lambda$ and $s_\mu$. This derivation is very close to what was done in [10] for excursions confined in a
strip. Some of the results of [10] will actually be used to shorten some arguments.

First proof via the kernel method. Consider a culminating walk of height k > a. Such a
walk has length at least 2. Delete its first and last steps: this gives a walk starting from level
a, ending at level k - a, and confined between levels 1 and k - 1. Shifting this walk one step
down, we obtain a non-negative walk starting from level a - 1 and ending at level k - 1 - a, of
height at most k - 2. Let $G(t, u) \equiv G(u)$ denote the generating function of non-negative walks
starting from a - 1, of height at most k - 2. In this series, the variable t keeps track of the length
while the variable u records the final height. Write $G(u) = \sum_{h=0}^{k-2} u^h G_h$, where $G_h$ counts walks
ending at height h. The above argument implies that the generating function of culminating
walks of height k is

$$C_k = t^2 G_{k-a-1}. \qquad (6)$$

We can construct the walks counted by G(u) step by step, starting from height a - 1, and adding
at each time a step +a (unless the current height is k - a - 1 or more) or -b (unless the current
height is b - 1 or less). In terms of generating functions, this gives:

$$G(u) = u^{a-1} + t(u^a + u^{-b})\, G(u) - t u^{-b} \sum_{h=0}^{b-1} u^h G_h - t u^a \sum_{h=k-a-1}^{k-2} u^h G_h,$$

that is,

$$\left( u^b - t(1 + u^{a+b}) \right) G(u) = u^{a+b-1} - t \sum_{h=0}^{b-1} u^h G_h - t u^{a+b} \sum_{h=k-a-1}^{k-2} u^h G_h.$$

The kernel of this equation, that is, the polynomial $u^b - t(1 + u^{a+b})$, has a + b distinct roots,
which are Puiseux series in t. We denote them $U_1, \ldots, U_{a+b}$. Recall that G(u) is a polynomial in
u (of degree k - 2). Replacing u by each of the $U_i$ gives a system of a + b linear equations relating
the unknown series $G_0, \ldots, G_{b-1}$ and $G_{k-a-1}, \ldots, G_{k-2}$. For $U = U_i$, with $1 \le i \le a + b$,

$$\sum_{h=0}^{b-1} U^h G_h + U^{a+b} \sum_{h=k-a-1}^{k-2} U^h G_h = \frac{U^{a+b-1}}{t}.$$

In matrix form, we have $\mathcal{M} \mathcal{G} = \mathcal{C}/t$, where $\mathcal{M}$ is the square matrix of size a + b given by

$$\mathcal{M} = \begin{pmatrix}
U_1^{a+b+k-2} & U_1^{a+b+k-3} & \cdots & U_1^{b+k-1} & U_1^{b-1} & U_1^{b-2} & \cdots & 1 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
U_{a+b}^{a+b+k-2} & U_{a+b}^{a+b+k-3} & \cdots & U_{a+b}^{b+k-1} & U_{a+b}^{b-1} & U_{a+b}^{b-2} & \cdots & 1
\end{pmatrix}, \qquad (7)$$

$\mathcal{G}$ is the column vector $(G_{k-2}, \ldots, G_{k-a-1}, G_{b-1}, \ldots, G_0)$ and $\mathcal{C}$ is the column vector
$(U_1^{a+b-1}, \ldots, U_{a+b}^{a+b-1})$. In view of the definition (4) of Schur functions,

$$\det(\mathcal{M}) = a_\delta(U)\, s_\lambda(U),$$

with $\lambda = (k-1)^a$. It has been shown in [10] that the generating function of excursions (walks
starting and ending at 0) confined in the strip of height k - 2 is

$$\frac{(-1)^{a-1}}{t}\, \frac{s_{(k-2)^a}(U)}{s_{(k-1)^a}(U)},$$

and that, in particular, $s_\lambda(U) \ne 0$. Hence $\mathcal{M}$ is invertible, and applying Cramer's rule to the
above system gives

$$G_{k-a-1} = \frac{1}{t}\, \frac{s_\mu(U)}{s_\lambda(U)},$$

with $\lambda$ and $\mu$ defined as in the statement of the proposition. Combining this with (6) gives the
desired Schur function form of $C_k$.
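As an illustration of this first proof (a worked example added here, not taken from the original text), take a = b = 1 and k = 3. Then G(u) counts non-negative walks starting from 0 and of height at most 1, which simply alternate between the levels 0 and 1, and the Schur function form can be checked by hand using $U_1 + U_2 = 1/t$ and $U_1 U_2 = 1$:

```latex
\begin{align*}
G_0 &= \frac{1}{1 - t^2}, \qquad G_1 = \frac{t}{1 - t^2}, \\
s_\mu(U) &= 1, \qquad
s_\lambda(U) = h_2(U_1, U_2) = (U_1 + U_2)^2 - U_1 U_2 = \frac{1}{t^2} - 1, \\
\frac{1}{t}\, \frac{s_\mu(U)}{s_\lambda(U)} &= \frac{1}{t} \cdot \frac{t^2}{1 - t^2} = \frac{t}{1 - t^2} = G_1 = G_{k-a-1}, \\
C_3 &= t^2 G_1 = \frac{t^3}{1 - t^2} = \frac{t^3}{F_2}.
\end{align*}
```

This agrees with the Fibonacci form $t^k/F_{k-1}$ of Proposition 3.1.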

A second proof via symmetric functions. Let us now give an alternative proof of the Schur
function expression of $C_k$. It will be based on the dual Jacobi-Trudi identity, which expresses
Schur functions as a determinant in the elementary symmetric functions [37, Cor. 7.16.2]: for
any partition $\nu$,

$$s_\nu = \det\left( e_{\nu'_j + i - j} \right), \qquad (8)$$

where $\nu'$ is the conjugate partition of $\nu$.

Let us consider the identity (2), with $D_k = \det(1 - tA_k)$. It turns out that this determinant
is of the form (8). Indeed, let us define $V_i = -U_i$, for $1 \le i \le a + b$. Then the only elementary
symmetric functions of the $V_i$ that do not vanish are $e_0(V) = 1$, $e_a(V) = -1/t$ and $e_{a+b}(V) = 1$
(with $V = (V_1, \ldots, V_{a+b})$). Let us apply (8) to $\nu = \lambda = (k-1)^a$, with variables $V_1, \ldots, V_{a+b}$.
Then $\nu' = a^{k-1}$ and one obtains

$$s_\lambda(V) = (-t)^{-(k-1)} \det(1 - tA_k) = (-1)^{a(k-1)} s_\lambda(U),$$

since $s_\lambda$ is homogeneous of degree a(k - 1). This gives the Schur function expression of $D_k$.

Now, by the general inversion formula for matrices, $N_k = (-1)^k \det\left( (1 - tA_k)^{k-a,a} \right)$, where
$(1 - tA_k)^{k-a,a}$ is obtained by deleting row k - a and column a from $(1 - tA_k)$. Let us apply (8)
to $\nu = \mu = ((k-1)^{a-1}, a-1)$. Then $\nu' = (a^{a-1}, (a-1)^{k-a})$. The matrix $\left( e_{\nu'_j + i - j} \right)$ has size k - 1,
and its last column contains only one non-zero entry (equal to $e_0(V) = 1$), in row k - a. After
deleting this row and the last column, one obtains:

$$s_\mu(V) = (-1)^{a-1} (-t)^{-(k-2)} \det\left( (1 - tA_k)^{k-a,a} \right) = (-1)^{a-1}\, t^{-(k-2)}\, N_k = (-1)^{k(a-1)} s_\mu(U),$$

as $s_\mu$ is homogeneous of degree k(a - 1). This gives the desired expression of $N_k$.

Linear recursions. Finally, let us prove that the sequences of polynomials $N_k$ and $D_k$ satisfy
a linear recurrence relation with coefficients in $\mathbb{Q}[t]$, the ring of polynomials in t. Equivalently,
we prove that each of the generating functions

$$N(z, t) := \sum_{k \ge a} N_k z^k \quad \text{and} \quad D(z, t) := \sum_{k \ge a} D_k z^k$$

is actually a rational function in z and t. The existence of a linear recursion then easily follows
by the general theory of rational series [36, Ch. 4].

Given the expression (5) of $N_k$, what we have to do is to evaluate

$$N'(z; u_1, \ldots, u_{a+b}) := \sum_{k \ge a} s_{(k-1)^{a-1}, a-1}\, z^k,$$

where the symmetric functions involve the a + b indeterminates $u_1, \ldots, u_n$, with n = a + b.
We use the definition (4) of Schur functions to write $s_{(k-1)^{a-1}, a-1}$ as a ratio of determinants of
size n. The determinant occurring at the denominator is the Vandermonde $V_n$ in the $u_i$'s, and
is independent of k. The determinant at the numerator is obtained from (7) by replacing the
column containing $U_i^{b+k-1}$ by a column of $U_i^{a+b-1}$ (and then each $U_i$ by the indeterminate $u_i$).
We expand it as a sum over permutations of length n, and obtain:

$$\begin{aligned}
N'(z; u) &= \frac{1}{V_n} \sum_{k \ge a} z^k \sum_{\sigma \in \mathfrak{S}_n} \varepsilon(\sigma)\, \sigma\!\left( u_1^{k+a+b-2} \cdots u_{a-1}^{k+b}\, u_a^{a+b-1}\, u_{a+1}^{b-1} \cdots u_{n-1}^{1} u_n^{0} \right) \\
&= \frac{1}{V_n} \sum_{\sigma \in \mathfrak{S}_n} \varepsilon(\sigma)\, \sigma\!\left( \frac{z^a\, u_1^{n+a-2} \cdots u_{a-1}^{a+b}\, u_a^{a+b-1}\, u_{a+1}^{b-1} \cdots u_{n-1}}{1 - z u_1 \cdots u_{a-1}} \right),
\end{aligned}$$

where $\sigma$ acts on functions of $u_1, \ldots, u_n$ by permuting the variables:

$$\sigma F(u_1, \ldots, u_n) = F(u_{\sigma(1)}, \ldots, u_{\sigma(n)}).$$

Equivalently,

$$N'(z; u) = \frac{P(z; u)}{Q(z; u)},$$

where

$$Q(z; u) = \prod_{I \subset [1, n],\ |I| = a-1} \left( 1 - z \prod_{i \in I} u_i \right)$$

and P(z; u) is another polynomial in z and the $u_i$, symmetric in the $u_i$'s. This symmetry property
shows that replacing $u_i$ by $U_i$ transforms N'(z; u) into a rational series in z and t. The link
between $N_k$ and $s_{(k-1)^{a-1}, a-1}$ then gives

$$N(z, t) = \frac{(-1)^{a-1}\, P((-1)^{a-1} t z; U)}{t^2\, Q((-1)^{a-1} t z; U)},$$

another rational function of z and t. A similar argument, given explicitly in [10], yields

$$D(z, t) = \frac{(-1)^{a-1}\, \bar P((-1)^{a-1} t z; U)}{t\, \bar Q((-1)^{a-1} t z; U)},$$

for two polynomials $\bar P$ and $\bar Q$ in z and $u_1, \ldots, u_n$. More precisely,

$$\bar Q(z; u) = \prod_{I \subset [1, n],\ |I| = a} \left( 1 - z \prod_{i \in I} u_i \right).$$

By looking at the degree of $Q$ and $\bar Q$, this establishes the existence of recurrence relations of
order $\binom{a+b}{a-1}$ for $N_k$, and $\binom{a+b}{a}$ for $D_k$. If there were recursions of a smaller order, the polynomials
$Q(z; U)$ or $\bar Q(z; U)$ would factor. It has been shown in [10, Section 6] that $\bar Q(z; U)$ is irreducible,
and the same argument implies that $Q(z; U)$ is irreducible as well. ■

3.2.3. Two proofs of Corollary 3.4. Let us specialize Proposition 3.3 to the case a = 1. We
observe that $\mu$ is the empty partition, so that $s_\mu = 1$, while $\lambda = (k-1)$, so that $s_\lambda = h_{k-1}$. The
expressions (5) of $N_k$ and $D_k$ in terms of Schur functions give $N_k = t^{k-2}$ and $D_k = t^{k-1} h_{k-1}(U)$.
Observe that $e_1(U) = 1/t$ and $e_{b+1}(U) = (-1)^{b+1}$, while the other $e_i(U)$, for $i \ge 1$, vanish. The classical relation between elementary
and complete symmetric functions [37, Eq. (7.13)] gives, for $k \ge 1$,

$$h_k(U) = \frac{1}{t}\, h_{k-1}(U) - h_{k-b-1}(U),$$

with initial conditions $h_0 = 1$ and $h_j = 0$ for j < 0. This gives the desired recursion for $D_k$.
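For completeness, here is how the recursion for $D_k$ follows from the one for $h_k(U)$ (a short derivation added for illustration). Write the relation at index k - 1 and multiply it by $t^{k-1}$, recalling that $D_k = t^{k-1} h_{k-1}(U)$:

```latex
\begin{align*}
D_k = t^{k-1} h_{k-1}(U)
  &= t^{k-2} h_{k-2}(U) - t^{k-1} h_{k-b-2}(U) \\
  &= D_{k-1} - t^{b+1} \left( t^{k-b-2} h_{k-b-2}(U) \right) \\
  &= D_{k-1} - t^{b+1} D_{k-b-1} \qquad \text{for } k > b + 1.
\end{align*}
```

For $2 \le k \le b + 1$, the term $h_{k-b-2}(U)$ vanishes, so that $D_k = D_{k-1}$; together with $D_1 = h_0 = 1$, this gives $D_k = 1$ for $1 \le k \le b + 1$.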

Let us now justify combinatorially the simplicity of $N_k$ and $D_k$. Recall that, for $k \ge 2$, one has $C_k = t^2 W_k$, where $W_k$ counts walks (with steps $+1$, $-b$) going from 1 to $k-1$ on the segment graph $[1, k-1]$. The adjacency matrix of this graph is $A_k$. The combinatorial description^1 of the inverse of the matrix $(1 - tA_k)$ tells us that $D_k$ counts non-intersecting collections of elementary cycles on the segment $[1, k-1]$, while $N_k$ counts configurations formed of a self-avoiding path $w$ going from 1 to $k-1$ together with a non-intersecting collection of elementary cycles that do not meet $w$. In the polynomials $N_k$ and $D_k$, each cycle of length $\ell$ is given a weight $(-t^\ell)$ while the path $w$ is simply weighted $t^\ell$ if it has length $\ell$. This gives directly $N_k = t^{k-2}$, as the only possible path $w$ is formed of $k-2$ up steps, and leaves no place to co-existing cycles. Now the only elementary cycles are formed of $b$ up steps and one down step $-b$. The recursion satisfied by $D_k$ is then obtained by discussing whether the point $k-1$ is contained in one such cycle.

Note that this proof can be rephrased in terms of heaps of cycles using Viennot's correspondence between walks on a graph and certain heaps [38]. The expression $N_k/D_k$ then appears as a specialisation of the inversion lemma (also found in [38]). In particular, $D_k$ is the (alternating) generating function of trivial heaps of cycles. □

Remark. For general values of $a$ and $b$, the description of $D_k$ and $N_k$ in terms of cycles and paths on the graph $[1, k-1]$ remains perfectly valid. But the structure of elementary cycles and self-avoiding paths becomes more complicated. See an example in Figure 5.




Figure 5. Two non-intersecting elementary cycles (for $a = 4$ and $b = 3$).



3.2.4. Proof of Proposition 3.1. The expression of $C_k$ is just a specialisation of Corollary 3.4 to the case $b = 1$. It remains to prove that the series $C(t)$ is not D-finite.

Let us first observe that $C(t)$ is D-finite if and only if the power series (in $u$) $B(u) := \sum_{k \ge 1} u^k/(1 - u^{2k})$ is D-finite. Indeed, one goes from $C(t)$ to $B(u)$, and vice-versa, by an algebraic substitution of the variable, as $U$ is an algebraic function of $t$ and $t = U/(1 + U^2)$. It is known that D-finite series are preserved by algebraic substitutions [37, Thm. 6.4.10], so that we can now focus on the series $B(u)$.

This series has integer coefficients, and radius of convergence 1. Hence it is either rational, or admits the unit circle as a natural boundary [12]. As will be recalled later (see (10)), the singular behaviour of $B(u)$ as $u$ approaches 1 involves a logarithm, which rules out the possibility of $B(u)$ being rational. Thus $B(u)$ has a natural boundary, and, in particular, infinitely many singularities. But D-finite series have only finitely many singularities, so that $B(u)$ is not D-finite. □

^1 This description seems to have been around since, at least, the 80's [25, 38]. See [9, Thm. 2.1] for a modern formulation.

4. Asymptotic enumerative results 

In this section we present some results on the asymptotic enumeration of culminating walks. 
Intuitively, three cases arise, depending on the drift of the walks, defined as the difference a — b. 
Indeed, an $n$-step random walk of positive drift is known to end at level $\Theta(n)$ and is, intuitively,
quite likely to be culminating. On the contrary, walks with a negative drift have a very small 
probability of staying positive. We first work out the intermediate case of a zero drift. 

4.1. Walks with a null drift (a = b = 1)

When the drift is zero, the number of positive walks (walks in which every step ends at a positive level) of length $n$ is known to be asymptotically equivalent to $2^n/\sqrt{2\pi n}$. The average height, and the average final level of these walks both scale like $\sqrt{n}$. Hence we can expect the number of culminating walks to be of the order of $2^n/n$. This is confirmed by the following result.

Proposition 4.1. As $n \to \infty$, the number of $(1,1)$-culminating paths of length $n$ is asymptotically equivalent to $2^n/(4n)$.

Proof. We start from the expression of $C(t)$, with $U = U_1 = O(t)$, and apply the singularity analysis of [23]. Note that $U(t)$ is an odd function of $t$. Let us first study the even part of $C(t)$, which counts culminating paths of even length:

$$C_e(t) = \frac{1 - U^2}{1 + U^2} \sum_{k \ge 1} \frac{U^{2k}}{1 - U^{4k}}.$$

Let $Z \equiv Z(x)$ be such that $U(t)^2 = Z(t^2)$. That is,

$$Z(x) = \frac{1 - 2x - \sqrt{1 - 4x}}{2x}.$$

The equation $U = t(1 + U^2)$ gives $Z = x(1 + Z)^2$. Moreover, we have $C_e(t) = D(t^2)$ where

$$D(x) = \frac{1 - Z}{1 + Z} \sum_{k \ge 1} \frac{Z^k}{1 - Z^{2k}}.$$

We thus need to study the asymptotic behaviour of the coefficients of $D(x)$. We write

$$D(x) = S(Z(x)), \quad\text{with}\quad S(z) = \frac{1 - z}{1 + z} \sum_{k \ge 1} \frac{z^k}{1 - z^{2k}}.$$

The series $Z(x)$ has radius of convergence 1/4. It is analytic in the domain $\mathcal D = \mathbb C \setminus [1/4, +\infty)$, with exactly one singularity, at $x = 1/4$. One has $Z(0) = 0$, and $|Z(x)| < 1$ for all $x$ in $\mathcal D$. Indeed, assume $|Z(x)| \ge 1$ for some $x$ in $\mathcal D$. By continuity, $Z(x) = e^{i\theta}$ for some $x$ in $\mathcal D$. From the equation $x(1 + Z)^2 = Z$, we conclude that $\theta \in (-\pi, \pi)$ (for $\theta = \pm\pi$, we would have $Z = -1 = 0$), and that $x = 1/(4\cos^2(\theta/2))$. But this contradicts the fact that $x \in \mathcal D$.

The series $S(z)$ has radius of convergence 1. Given that $|Z(x)| < 1$ in $\mathcal D$, this implies that $D(x) = S(Z(x))$ is analytic in the domain $\mathcal D$. It remains to understand how $D(x)$ behaves as $x$ approaches 1/4 in $\mathcal D$.

Take $x = (1 - re^{i\theta})/4$, with $0 < r < 1$ and $|\theta| < \pi$. Then

$$Z(x) = 1 - 2\sqrt{1 - 4x} + O(1 - 4x) = 1 - 2\sqrt{r}\, e^{i\theta/2} + O(r).$$

In particular,

$$\arg(1 - Z(x)) = \theta/2 + O(\sqrt{r}).$$
Choose $\alpha \in (\pi/4, \pi/2)$. The above identity shows that there exist $\eta > 0$ and $\pi/2 < \phi < \pi$ such that, in the indented disk

$$\mathcal I = \{x : |1 - 4x| < \eta \text{ and } |\arg(1 - 4x)| < \phi\},$$

one has

$$|\arg(1 - Z(x))| < \alpha. \qquad (9)$$

Now when $z \to 1$ in such a way that $|\arg(1 - z)| < \alpha$,

$$\sum_{k \ge 1} \frac{z^k}{1 - z^{2k}} \sim \frac{1}{2(1 - z)} \log\frac{1}{1 - z}, \quad\text{so that}\quad S(z) \sim \frac{1}{4} \log\frac{1}{1 - z}. \qquad (10)$$

This can be obtained using a Mellin transform or some already known results on the generating function of divisor sums [22].

Combining (9) and (10) shows that, as $x$ tends to 1/4 in the indented disk $\mathcal I$,

$$D(x) = S(Z(x)) \sim \frac{1}{8} \log\frac{1}{1 - 4x}. \qquad (11)$$

This allows us to apply the transfer theorems of [23]. Indeed, the series $D(x)$ is analytic in the following domain:

$$\Delta = \{x \ne 1/4 : |4x| < 1 + \eta \text{ and } |\arg(1 - 4x)| < \phi\},$$

with singular behaviour near $x = 1/4$ given by (11). From this we conclude that the coefficient of $x^n$ in $D(x)$ is asymptotically equivalent to $4^n/(8n)$. Going back to the series $C_e(t)$, this means that the number of culminating paths of (even) length $N = 2n$ is asymptotically equivalent to $2^N/(4N)$.

The study of the odd part of $C(t)$ is similar. □
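The estimate $2^n/(4n)$ can be compared with exact counts. The Python sketch below is only an illustration: it counts $(1,1)$-culminating paths by summing, over the final height $k$, the walks confined to the strip $[1, k-1]$ that end with an up step from $k-1$ to $k$. The function name is hypothetical.

```python
def culminating_counts_11(n_max):
    """counts[n] = number of (1,1)-culminating paths of length n, for 1 <= n <= n_max."""
    counts = [0] * (n_max + 1)
    for k in range(1, n_max + 1):            # k = final height
        # row[h] = number of walks from 0 confined to [1, k-1] (after the start)
        # that currently sit at height h; row[k] is never filled and stays 0
        row = [0] * (k + 1)
        row[0] = 1                           # the empty walk
        for n in range(1, n_max + 1):
            # row describes the first n-1 steps; a last up step from k-1 ends at k
            counts[n] += row[k - 1]
            new = [0] * (k + 1)
            for h in range(1, k):
                new[h] = row[h - 1] + row[h + 1]
            row = new
    return counts


if __name__ == "__main__":
    counts = culminating_counts_11(200)
    for n in (50, 100, 200):
        # Proposition 4.1 states that this ratio tends to 1
        print(n, counts[n] * 4 * n / 2 ** n)
```

The computation uses $O(n^3)$ additions on large integers, which is enough for lengths of a few hundred.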



4.2. Walks with positive drift (a > b)

When the drift is positive, it is known that, asymptotically, a positive fraction of walks with steps $+a$, $-b$ is actually positive (every step ends at a positive level). More precisely, as $n \to \infty$, the number $p_n^{a,b}$ of positive walks of length $n$ satisfies

$$p_n^{a,b} \sim \kappa_{a,b}\, 2^n \qquad (12)$$

for some positive constant $\kappa_{a,b}$. We will show that the culmination and final record conditions play similar filtering roles in the paths of $\{m, \bar m\}^*$, and prove the following result.

Proposition 4.2. For $a > b$, the number $c_n^{a,b}$ of culminating walks of length $n$ satisfies

$$c_n^{a,b} = \kappa_{a,b}^2\, 2^n + O(\rho^n),$$

where $\rho < 2$ and $\kappa_{a,b}$ is the constant involved in the asymptotics of positive walks.

Proof. In what follows, we consider two families of paths that are close to the meanders and excursions defined in the introduction: the (already defined) positive walks, and certain quasi-excursions. The exact and asymptotic enumeration of meanders and excursions has been completely worked out in [5], and we will rely heavily on this paper. For instance, the estimate (12) follows from the results of [5] by noticing that a meander factors into an excursion followed by a positive walk. Let us call quasi-excursion a walk in which every step, except the final one, ends
at a positive level. For instance, if $a = 3$ and $b = 2$, the word $m\bar m m\bar m\bar m$ is a quasi-excursion. By
removing the last step of such a walk, we see that quasi-excursions are in bijection with positive walks of final height $1, 2, \ldots$, or $b$. We denote the number of quasi-excursions of length $n$ by $e_n^{a,b}$. Using the results of [5], it is easy to see that, when the drift is positive, quasi-excursions are exponentially rare among general walks. That is, there exists $\mu < 2$ such that for $n$ large enough,

$$e_n^{a,b} < \mu^n. \qquad (13)$$
From now on, we drop the superscripts $a$ and $b$, writing for instance $c_n$ rather than $c_n^{a,b}$. For any word $w = w_1 \cdots w_k$, denote by $\overleftarrow{w}$ the mirror image of $w$, that is, $\overleftarrow{w} = w_k \cdots w_1$. Let $u$ be a culminating word of length $n$, and write $u = v\overleftarrow{w}$, where the word $v$ (resp. $w$) has length $\lfloor n/2 \rfloor$ (resp. $\lceil n/2 \rceil$). Then both $v$ and $w$ are positive walks, and this proves that

$$c_n \le p_{\lfloor n/2 \rfloor}\, p_{\lceil n/2 \rceil}. \qquad (14)$$

Conversely, let us bound the number of pairs $(v, w)$, where $v$ and $w$ are positive walks of respective lengths $\lfloor n/2 \rfloor$ and $\lceil n/2 \rceil$, such that the word $u = v\overleftarrow{w}$ is not culminating. This means that

• either $u$ factors as $v_1 w_1$, where $v_1$ is a quasi-excursion of length $i > \lfloor n/2 \rfloor$,

• or, symmetrically, $u$ factors as $v_2 \overleftarrow{w_2}$ where $w_2$ is a quasi-excursion of length $j > \lceil n/2 \rceil$.

This implies that

$$p_{\lfloor n/2 \rfloor}\, p_{\lceil n/2 \rceil} - c_n \le 2 \sum_{i = \lfloor n/2 \rfloor}^{n} e_i\, 2^{n-i}.$$

In view of (13), we have, for $n$ large enough:

$$p_{\lfloor n/2 \rfloor}\, p_{\lceil n/2 \rceil} - c_n \le 2 \sum_{i = \lfloor n/2 \rfloor}^{n} \mu^i\, 2^{n-i} \le 2^{n+1} \sum_{i \ge \lfloor n/2 \rfloor} (\mu/2)^i \le \frac{8}{2 - \mu}\, (2\mu)^{\lfloor n/2 \rfloor}.$$

Combining this with (14) and the known asymptotics for the numbers $p_n$ gives the expected result. □
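The inequality (14) is easy to test on exact counts. The following Python sketch is an illustration under stated conventions: it takes $p_0 = 1$ (the empty walk), counts positive walks by a dynamic program over the current height, and counts culminating walks by a dynamic program over the pair (current height, maximum of the earlier heights). The function names are hypothetical.

```python
def positive_counts(n_max, a, b):
    """p[n] = number of walks of length n with steps +a, -b whose steps all end
    at a positive level (p[0] = 1 by convention)."""
    p, level = [1], {0: 1}
    for _ in range(n_max):
        nxt = {}
        for h, cnt in level.items():
            for s in (a, -b):
                if h + s > 0:
                    nxt[h + s] = nxt.get(h + s, 0) + cnt
        level = nxt
        p.append(sum(level.values()))
    return p


def culminating_counts(n_max, a, b):
    """c[n] = number of culminating walks of length n (c[0] = 1 by convention).
    A state is (current height, maximum of all earlier heights)."""
    c, states = [1], {(0, -1): 1}
    for _ in range(n_max):
        nxt = {}
        for (h, m), cnt in states.items():
            for s in (a, -b):
                if h + s > 0:
                    key = (h + s, max(m, h))
                    nxt[key] = nxt.get(key, 0) + cnt
        states = nxt
        # culminating: the final height is strictly above every earlier height
        c.append(sum(cnt for (h, m), cnt in states.items() if h > m))
    return c


if __name__ == "__main__":
    a, b = 3, 2
    p, c = positive_counts(30, a, b), culminating_counts(30, a, b)
    for n in (10, 20, 30):
        print(n, c[n], p[n // 2] * p[n - n // 2], c[n] / 2 ** n)
```

For $a > b$, Proposition 4.2 states that the last printed ratio $c_n/2^n$ converges, and that the gap in (14) is exponentially smaller than $2^n$.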

4.3. Walks with negative drift (a<b): exponential decay 

When the drift is negative, it is known that positive walks are exponentially rare among general walks. Indeed, there exist constants $\kappa_{a,b} > 0$ and $\alpha_{a,b} \in (1, 2)$, such that

$$p_n^{a,b} \sim \kappa_{a,b}\, \frac{\alpha_{a,b}^n}{n^{3/2}}.$$

More precisely,

$$\alpha_{a,b} = \frac{a + b}{a^{a/(a+b)}\, b^{b/(a+b)}} = \frac{1 + q}{q^{q/(1+q)}} = \alpha(q), \qquad (15)$$

where $q = a/b < 1$. We show below that the constant $\alpha_{a,b}$ also governs the number of culminating walks of size $n$.



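Proposition 4.3. For $a < b$, the number $c_n^{a,b}$ of culminating walks of length $n$ satisfies

$$c_n^{a,b} = O\!\left(\frac{\alpha_{a,b}^n}{n^3}\right), \qquad (16)$$

where $\alpha_{a,b}$ is given above. Moreover,

$$\lim_{n \to \infty} \left(c_n^{a,b}\right)^{1/n} = \alpha_{a,b}.$$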



Proof. The inequality (14) still holds, and gives the upper bound (16) on the number of culminating paths.

Let us now prove that the growth constant of culminating walks is still $\alpha_{a,b}$ by constructing a large class of such walks. Let $\mathcal E_n$ be the set of excursions of length $n$ (from now on, we drop the superscripts $a$ and $b$). Such excursions only exist when $n$ is a multiple of $a + b$, and the number $e_n$ of such walks then satisfies

$$e_n \sim \kappa\, \alpha_{a,b}^n\, n^{-3/2}$$

for some positive constant $\kappa$. It is known that random $(a, b)$-excursions of length $n$ converge in law to the Brownian excursion, after normalising the length by $n$ and the height by $\kappa' \sqrt{n}$, for some constant $\kappa'$ depending on $a$ and $b$ [29]. This implies that the (normalized) height of a discrete excursion converges in law to the height of the Brownian excursion (described by a theta distribution). In particular, the probability $p_n$ that an excursion of $\mathcal E_n$ has height larger than $\sqrt{n}$ tends to a limit $p < 1$ as $n$ goes to infinity. Take an excursion of $\mathcal E_k$ of height less than $\sqrt{n}$, with

$$k = (a + b) \left\lfloor \frac{n - 1 - \sqrt{n}}{a + b} \right\rfloor,$$

and append one up step at its left, and $n - k - 1$ up steps at its right: this gives a culminating walk of length $n$, which proves that

$$c_n \ge e_k (1 - p_k).$$

Taking $n$th roots gives the required lower bound on the growth of $c_n$. □
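The growth constant is simple to evaluate. The Python sketch below is an illustration that assumes the closed form $\alpha_{a,b} = (a+b)/(a^{a/(a+b)} b^{b/(a+b)})$ for (15), together with its expression in terms of $q = a/b$. The third function relies on a standard fact that is not stated in this section: the growth constant is the minimum over $u > 0$ of the step polynomial $u^a + u^{-b}$, attained at $\tau = (b/a)^{1/(a+b)}$. The function names are hypothetical.

```python
import math


def alpha(a, b):
    """Assumed closed form of the growth constant alpha_{a,b}."""
    return (a + b) / (a ** (a / (a + b)) * b ** (b / (a + b)))


def alpha_of_ratio(q):
    """The same constant written as a function of q = a/b."""
    return (1 + q) / q ** (q / (1 + q))


def step_polynomial_minimum(a, b):
    """Value of u^a + u^(-b) at its critical point tau = (b/a)^(1/(a+b))."""
    tau = (b / a) ** (1 / (a + b))
    return tau ** a + tau ** (-b)


if __name__ == "__main__":
    for a, b in ((1, 2), (2, 3), (1, 5)):
        print(a, b, alpha(a, b), alpha_of_ratio(a / b), step_polynomial_minimum(a, b))
```

For $a = b$ the three expressions equal 2, and for $a < b$ they lie strictly between 1 and 2.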

Hence there are exponentially few walks of size $n$ with steps $+a$, $-b$ that are culminating. It is likely that $c_n$ behaves like $\alpha_{a,b}^n n^{-3-\gamma}$, for some $\gamma > 0$ that remains to be determined. Note that the final height of an $n$-step meander is known to have a discrete limit law as $n \to \infty$ [5].

5. Random generation of positive walks 

The random generation of positive walks will be a preliminary step in some of the algorithms 
we present in the next section for the generation of culminating walks. The main ideas underlying 
the generation are the same for both classes of walks, but the class of positive walks is simpler. We 
apply three different approaches to their random generation: recursive methods (two versions), 
anticipated rejection, and Boltzmann sampling. The choice of the best algorithm depends on the 
drift, as summarized in the top part of Table 2. We denote by $\mathcal P^{a,b}$ the language of positive walks, but the superscript $a, b$ will often be dropped.

5.1. Recursive step-by-step approach 

The first approach we present is elementary: we construct positive walks step-by-step, choos- 
ing at each time an up or down step with the right probability. This is the basis of the recursive 
approach introduced in [39]. Here are the three ideas underlying the algorithm: 

• Let $\mathcal W$ be a language, and let $\mathcal W_p$ denote the language of the prefixes of words of $\mathcal W$. Assume that for all $w \in \mathcal W_p$ such that $|w| \le n$, we know the number $N_w(n)$ of words of $\mathcal W$ of length $n$ beginning with $w$ (we call these words extensions of $w$). Then it is possible to draw uniformly words of length $n$ in $\mathcal W$ as follows. One starts from the empty word, and adds steps incrementally. If at some point the prefix that is built is $w$, one adds the letter $x$ to $w$ with probability $N_{wx}(n)/N_w(n)$.

• When $\mathcal W = \mathcal P^{a,b}$, the number of extensions of length $n$ of a prefix $w \in \mathcal W_p$ depends only on two parameters:

  - the length difference $i = n - |w|$,

  - the final height of $w$, $j = \phi_{a,b}(w)$.

• Let $p_{i,j}$ be the number of extensions of length $n$ of such a prefix $w$. The numbers $p_{i,j}$ obey the following recurrence:

$$p_{i,j} = p_{i-1,j+a} + \mathbb 1_{j > b}\, p_{i-1,j-b} \quad\text{for } i \ge 1,$$

with $p_{0,j} = 1$.

As the two parameters $i$ and $j$ are bounded by $n$ and $an$ respectively, the precomputation of the numbers $p_{i,j}$ takes $O(n^2)$ arithmetic operations and requires to store $O(n^2)$ numbers. Then, the generation of a random word of length $n$ can be performed in linear time. However, one should take into account the cost due to the size of the numbers in the precomputation stage. Indeed, the numbers $p_{i,j}$ are exponential in $n$, so that the actual time-space complexity for this stage may grow to $O(n^3)$. However, using a floating-point technique adapted from [16], it should be possible to take advantage of the numerical stability of the algorithm to reduce the space needed to $O(n^{2+\varepsilon})$.

This naive recursive approach is less efficient than the one presented below, which is based on context-free grammars. But it will be easily adapted to the generation of culminating walks, which cannot be generated via a grammar, as was proved in Section 2.
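The step-by-step scheme described above can be sketched in a few lines of Python. This is a minimal illustration with exact integers, not the authors' implementation: it assumes the recurrence $p_{i,j} = p_{i-1,j+a} + \mathbb 1_{j>b}\, p_{i-1,j-b}$ with $p_{0,j} = 1$, and the function names are hypothetical.

```python
import random
from itertools import accumulate


def extension_table(n, a, b):
    """p[i][j]: number of ways to extend a positive prefix ending at height j
    by i further steps (+a or -b) while staying at a positive level."""
    p = [[1] * (a * n + 1)]                      # p[0][j] = 1
    for i in range(1, n + 1):
        prev = p[i - 1]
        # a prefix of length n - i has height at most a * (n - i)
        p.append([prev[j + a] + (prev[j - b] if j > b else 0)
                  for j in range(a * (n - i) + 1)])
    return p


def random_positive_walk(n, a, b, rng=random):
    """Uniform random positive walk of length n, returned as a list of steps."""
    p = extension_table(n, a, b)
    walk, j = [], 0
    for i in range(n, 0, -1):                    # i = number of steps still to draw
        ups = p[i - 1][j + a]                    # extensions that start with an up step
        step = a if rng.randrange(p[i][j]) < ups else -b
        walk.append(step)
        j += step
    return walk


if __name__ == "__main__":
    w = random_positive_walk(20, 2, 3)
    print(w, list(accumulate(w)))
```

The table holds $O(n^2)$ integers of $O(n)$ bits each, which matches the cubic bit-complexity discussed above; the drawing itself uses $n$ calls to `randrange`.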
5.2. Recursive approach via context-free grammars

It is easy to see that the language $\mathcal P^{a,b} = \mathcal P$ is recognized by a non-deterministic push-down automaton. This implies that $\mathcal P$ is context-free. The same holds for the language $\mathcal D^{a,b} = \mathcal D$ of excursions. A non-ambiguous context-free grammar generating excursions is given explicitly in [19]. It suffices to add one equation to obtain a non-ambiguous grammar generating positive walks:

$$\mathcal D = \varepsilon + \sum_{k} \mathcal L_k \mathcal R_k, \qquad \ldots \qquad (17)$$

In this system, $\varepsilon$ is the empty word, $\mathcal D$ (resp. $\mathcal P$) is the language of excursions (resp. positive walks) while $\mathcal L_i$, $1 \le i \le a$ and $\mathcal R_j$, $1 \le j \le b$, are $a + b$ auxiliary languages defined in [19]. As above, $m$ and $\bar m$ are the up and down letters in our alphabet.

From this grammar, we can apply the recursive approach of [24] for the uniform generation of decomposable objects, implemented in the combstruct package of Maple or in the stand-alone software GenRGenS [35]. The generation of positive walks of size $n$ begins with the precomputation of $O(n)$ large numbers. These numbers count words of length $r$, for all $r \le n$, in each of the languages involved in the grammar. The fastest way to get them is to convert the algebraic system (17) into a system of linear differential equations, which, in turn, yields a system of linear recurrence relations (with polynomial coefficients) defining the requested numbers. This step requires a linear number of arithmetic operations. But one has to multiply numbers whose size (number of digits) is $O(n)$, which may result, in practice, in a quadratic time-complexity for the precomputation stage. Then, the generation of a random positive walk can be performed in time $O(n \log n)$.

Note that a careful implementation [15] of the floating point approach of [16] using an arbitrary-precision floating-point computation library yields a $O(n^{1+\varepsilon})$ complexity after a $O(n^{1+\varepsilon})$ precomputation.

5.3. Anticipated rejection

The principle of this approach is to start with an empty walk, and then add successive up and down steps by flipping an unbiased coin until the walk reaches the desired length $n$, or a non-positive ordinate. In the latter case, the walk is rejected and the procedure starts from the beginning. Of course, no precomputation nor non-linear storage is required. This principle was applied to meanders, in the case $a = b = 1$, in [6], as a first step towards the uniform random generation of directed animals. The analysis of this algorithm yielded a linear time-complexity, later generalized in [7] to the case of coloured walks, in which up, down, and level steps come respectively in $p$, $q$ and $r$ different colours. There, it was shown that the time-complexity is linear when $p \ge q$, but exponential when $p < q$.

Unsurprisingly, we obtain similar results for the general $(a, b)$-case.

Proposition 5.1. The anticipated rejection scheme applied to the uniform random generation of $(a, b)$-positive walks has a linear time-complexity when $a \ge b$ and an exponential complexity in $\Theta((2/\alpha_{a,b})^n\, n\sqrt{n})$ when $a < b$, with $\alpha_{a,b} < 2$ given by (15).

Proof. We first note that the language $\mathcal P$ of positive walks is a left-factor language. That is, it is stable by taking prefixes, and every word of $\mathcal P$ is the proper prefix of another word of $\mathcal P$. It has been proved in [14] that the average complexity $f_{\mathcal L}(n)$ of the anticipated rejection scheme for a left-factor language $\mathcal L$ on a $k$-letter alphabet is

$$f_{\mathcal L}(n) = \frac{[z^n]\, \dfrac{L(z/k)}{1 - z}}{[z^n]\, L(z/k)},$$

where $L(z)$ is the length generating function of the words of $\mathcal L$.

We now exploit the results of [5], giving the singular behaviour of the series $M(z)$ and $E(z)$ that count respectively meanders and excursions. As a meander factors uniquely as an excursion followed by a positive walk, we can derive from [5] the singular behaviour of the series $P(z) = \sum_n p_n z^n$ that counts positive walks. This series is always algebraic, so that singularity analysis applies.

- For $a \ge b$, the series $P(z/2)$ has an algebraic singularity at $z = 1$ in $(1 - z)^{-\nu}$ (with $\nu = 1$ if $a > b$, and $\nu = 1/2$ if $a = b$). Thus $P(z/2)/(1 - z)$ has a singular behaviour in $(1 - z)^{-\nu-1}$. A singularity analysis gives $f_{\mathcal P}(n) \sim n/\nu$.

- For $a < b$, the series $P(z/2)$ has a square-root singularity at $z = 2/\alpha_{a,b} > 1$, but $P(z/2)/(1 - z)$ has a smaller radius of convergence $z_c = 1$, with a simple pole at this point. This gives

$$f_{\mathcal P}(n) \sim \kappa\, (2/\alpha_{a,b})^n\, n^{3/2}$$

for some constant $\kappa$.
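The anticipated rejection scheme of this section takes only a few lines. The Python sketch below is a minimal illustration with hypothetical function names: it restarts as soon as the walk under construction reaches a non-positive level, so that each positive walk of length $n$ is returned with the same probability.

```python
import random
from itertools import accumulate


def positive_walk_by_rejection(n, a, b, rng=random):
    """Uniform random positive walk of length n by anticipated rejection."""
    while True:
        walk, height = [], 0
        for _ in range(n):
            step = a if rng.random() < 0.5 else -b   # unbiased coin
            height += step
            if height <= 0:                          # reject as early as possible
                break
            walk.append(step)
        else:
            return walk                              # all n steps were accepted


def is_positive(walk):
    """True if every step of the walk ends at a positive level."""
    return all(h > 0 for h in accumulate(walk))


if __name__ == "__main__":
    w = positive_walk_by_rejection(40, 3, 2)
    print(is_positive(w), w)
```

By Proposition 5.1 the expected number of coin flips is linear in $n$ when $a \ge b$; for $a < b$ it grows exponentially, so this sketch should only be run for small $n$ in that case.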



5.4. Boltzmann sampling

A Boltzmann generator [20] generates every object in the class $\mathcal C$ with a probability proportional to $x^n$, where $n$ is the size of the object. More precisely, for every object $w$ (a walk, in our context):

$$\mathbb P(w) = \frac{x^{|w|}}{C(x)},$$

where $C(x)$ is the generating function of the objects of $\mathcal C$. Of course, this results in a relaxation of the size constraint, since objects of all sizes can be generated. But, by tuning carefully the parameter $x$ (which has to be smaller than or equal to the radius of convergence of $C(x)$), and rejecting the too large and too small objects, one can often achieve an approximate-size random sampling, with a tolerance $\varepsilon$, in linear time. This means that after a linear number of real-arithmetic operations, and a number of attempts that is constant on average, the algorithm will produce an object of size $|w| \in [(1 - \varepsilon)n, (1 + \varepsilon)n]$, which is uniform among the objects of the same size.

In particular, the grammar (17) shows that the class of positive walks is specifiable in the sense of [20]. The analysis of the generating functions of meanders and excursions performed in [5] shows that the series $P(z)$ counting positive walks is always analytic in a $\Delta$-domain, with a dominant singularity in $(1 - \mu z)^{-\nu}$, where $\nu = 1$ if $a > b$, $\nu = 1/2$ if $a = b$ and $\nu = -1/2$ if $a < b$. In the first two cases, Theorem 6.3 of [20] gives an approximate sampling in linear time (and an exact sampling in quadratic time). In the third case, the standard deviation of the size of the objects produced by a standard Boltzmann sampler is much larger than their mean, which makes rejection costly. However, we can generate instead pointed positive walks, that is, positive walks with a distinguished step, and forget the pointing: as guaranteed by Theorem 6.5 of [20], this gives again an approximate sampling in linear time.

To conclude, the uniform random generation of $(a, b)$-positive walks of size $n$ can be performed in linear time when $a \ge b$ by an anticipated rejection, and this strategy does not require any precomputations nor storage. When $a < b$, our best algorithm for exact sampling remains the recursive approach based on the grammar (17). It runs in $O(n^{1+\varepsilon})$ after a $O(n^{1+\varepsilon})$ precomputation. However, one can achieve, in linear time and space, an approximate-size sampling using a Boltzmann generator.



6.1. Recursive step-by-step approach 

This elementary procedure, introduced in [30], generates culminating walks step by step, choosing every new step with the right probability. This is again an instance of Wilf's recursive method. The arguments given in Section 5.1 for positive walks should now be replaced by the following ones:

• For $\mathcal W = \mathcal C^{a,b}$, the number of extensions of length $n$ of a prefix $w \in \mathcal W_p$ depends only on three parameters:

  - the length difference $i = n - |w|$,

  - the final height $j = \phi_{a,b}(w)$,

  - the maximal height $h$ reached by $w$.

• Let $c_{i,j,h}$ be the number of extensions of length $n$ of such a prefix $w$. The numbers $c_{i,j,h}$ obey the following recurrence:

$$c_{i,j,h} = c_{i-1,j+a,\max(h,j+a)} + \mathbb 1_{j > b}\, c_{i-1,j-b,h} \quad\text{for } i \ge 2,$$

with $c_{1,j,h} = \mathbb 1_{j+a > h}$.

As the parameters $i$, $j$ and $h$ are bounded by $n$, $an$ and $an$ respectively, the precomputation of the numbers $c_{i,j,h}$ takes $O(n^3)$ arithmetic operations and requires to store $O(n^3)$ numbers. Then, the generation of a random word of length $n$ can be performed in linear time. But again, the numbers $c_{i,j,h}$ are exponential in $n$, so that the actual time-space complexity of the precomputation stage may grow to $O(n^4)$.

The above procedure is easily adapted to generate culminating walks ending at a prescribed height $k$. The number $c^{(k)}_{i,j}$ of $i$-step extensions of a prefix ending at height $j$ is given by

$$c^{(k)}_{i,j} = \mathbb 1_{j+a < k}\, c^{(k)}_{i-1,j+a} + \mathbb 1_{j > b}\, c^{(k)}_{i-1,j-b} \quad\text{for } i \ge 2,$$

with $c^{(k)}_{1,j} = \mathbb 1_{j+a = k}$. Now $j$ is bounded by $k$, so that we only have to compute a table of $O(kn)$ numbers, in $O(kn)$ arithmetic operations. The actual time-space complexity is likely to grow to $O(kn^2)$ due to the handling of large numbers.

However, whether the height of the walk is fixed or not, one should be able to limit the computational overhead due to the size of these numbers to $O(n^{\varepsilon})$, using a floating-point technique adapted from [16].
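For a prescribed final height $k$, the recursive method can be sketched as follows in Python. This is an illustration with exact integers and hypothetical names. It assumes the recurrence $c^{(k)}_{i,j} = \mathbb 1_{j+a<k}\, c^{(k)}_{i-1,j+a} + \mathbb 1_{j>b}\, c^{(k)}_{i-1,j-b}$ for $i \ge 2$, with $c^{(k)}_{1,j} = \mathbb 1_{j+a=k}$, which expresses that the walk stays in $[1, k-1]$ until a last up step reaches $k$.

```python
import random
from itertools import accumulate


def height_table(n, k, a, b):
    """c[i][j] for 1 <= i <= n and 0 <= j < k: number of i-step extensions, reaching
    height k for the first time at their last step, of a prefix ending at height j."""
    c = [None, [1 if j + a == k else 0 for j in range(k)]]
    for i in range(2, n + 1):
        prev = c[i - 1]
        c.append([(prev[j + a] if j + a < k else 0) + (prev[j - b] if j > b else 0)
                  for j in range(k)])
    return c


def random_culminating_walk(n, k, a, b, rng=random):
    """Uniform culminating walk of length n >= 1 ending at height k, or None if
    no such walk exists."""
    c = height_table(n, k, a, b)
    if c[n][0] == 0:
        return None
    walk, j = [], 0
    for i in range(n, 1, -1):                    # i = number of steps still to draw
        ups = c[i - 1][j + a] if j + a < k else 0
        step = a if rng.randrange(c[i][j]) < ups else -b
        walk.append(step)
        j += step
    walk.append(a)                               # the last step is the record step
    return walk


def is_culminating(walk):
    """Positivity and final record conditions on a list of steps."""
    heights = list(accumulate(walk))
    return (bool(heights) and all(h > 0 for h in heights)
            and heights[-1] > max(heights[:-1], default=0))
```

Summing `height_table(n, k, a, b)[n][0]` over $k$ gives the total number of culminating walks of length $n$, so the same table also yields a sampler for unconstrained height by first drawing $k$ with the appropriate weights.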

6.2. Rejection methods 

We presented in Section 5.3 an example of the anticipated rejection approach. The more
general rejection principle has been applied successfully to various problems [TTl [U [20]. The 
principle of a rejection algorithm for words in $\mathcal W$ is to draw objects uniformly in a superset $\mathcal V \supset \mathcal W$ until an object of $\mathcal W$ is found. The average-case complexity of such a technique is then $\zeta(n)\, v_n/w_n$, where $\zeta(n)$ is the cost for the generation of a word of size $n$ in $\mathcal V$, and $w_n$ and $v_n$ respectively denote the number of words of length $n$ in $\mathcal W$ and $\mathcal V$.

The aim is to find a superset $\mathcal V$ satisfying the following (sometimes conflicting) requirements:

- the words of $\mathcal V$ can be generated quickly, so that $\zeta(n)$ is small,

- the set $\mathcal V$ is not too large, so that the ratio $v_n/w_n$ is small.

Moreover, testing whether a word of $\mathcal V$ actually belongs to $\mathcal W$ should be doable in linear time. This is obviously the case when $\mathcal W = \mathcal C^{a,b}$.

We investigate below two possibilities for the superset $\mathcal V$, while fixing $\mathcal W = \mathcal C^{a,b}$.

6.2.1. Drawing from positive walks. Here, we take for $\mathcal V$ the set of positive walks. Their random generation has been discussed in Section 5 and we refer to the last lines of this section for our conclusions on this question.

- When $a < b$, the number $v_n$ of positive walks of length $n$ grows like $\alpha_{a,b}^n n^{-3/2}$ (up to a multiplicative constant). If $c_n^{a,b}$ grows like $\alpha_{a,b}^n n^{-3-\gamma}$ for $\gamma > 0$ (see Proposition 4.3), the cost will be $O(n^{5/2+\gamma+\varepsilon})$, with a preprocessing stage of $O(n^{1+\varepsilon})$. However, approximate-size sampling can be performed in time $O(n^{5/2+\gamma})$, with no preprocessing stage. It suffices to reject among the set of positive walks generated by a Boltzmann algorithm.

- If $a = b$, then $v_n$ grows like $2^n n^{-1/2}$, while $c_n$ grows like $2^n/n$ (Proposition 4.1). Hence the cost here is $O(n^{3/2})$.

- Finally, for $a > b$, the number of culminating walks grows like $2^n$ (Proposition 4.2). This shows that the algorithm is linear.

Remark. For $a > b$, culminating walks are so numerous that we can even perform the rejection in the set of general $(a, b)$-walks, and still obtain a linear complexity, as discussed in the introduction. However, it seems natural to perform an anticipated rejection, rejecting walks as soon as they stop being positive: but this amounts to performing rejection in the set of positive walks, obtained themselves via an anticipated rejection from general walks.

6.2.2. Drawing from hybrid walks. We begin with a simple, yet crucial, observation:

Let $\overleftarrow{w}$ denote the mirror image of the word $w$. Then if $w \in \mathcal C^{a,b}$, so is $\overleftarrow{w}$.

Graphically, taking the mirror image amounts to a central symmetry on walks. This remark implies that, on average, the mid-point of a culminating walk lies at a height which is half the final height. This suggests another possible superset of $\mathcal C^{a,b}$ from which we may draw, namely the language $\mathcal H^{a,b}$ of hybrid walks, defined by

$$\mathcal H = \mathcal H^{a,b} := \bigcup_{n \ge 0} \mathcal P_{\lfloor n/2 \rfloor}\, \overleftarrow{\mathcal P}_{\lceil n/2 \rceil},$$

where $\mathcal P$ is the language of positive walks, and $\overleftarrow{\mathcal P}$ the language of mirror images of positive walks. As already observed in Section 4, $\mathcal C^{a,b} \subset \mathcal H^{a,b}$.

The intuition behind the choice of the superset $\mathcal H^{a,b}$ is that a path that violates the positivity (resp. final record) condition is likely to do so at its beginning (resp. ending). Thus, ensuring positivity on the first half of the walk, and the final record condition on the second half, should yield a lower rejection probability than ensuring positivity everywhere, as we did when drawing from positive walks.

How can one generate hybrid walks uniformly at random? As a hybrid walk of length $n$ is the (non-ambiguous) concatenation of a positive walk of size $\lfloor n/2 \rfloor$ and of the mirror image of another positive walk, of size $\lceil n/2 \rceil$, it is sufficient to draw positive walks uniformly at random. The cost of the generation of a hybrid walk of length $n$ will be twice the cost of the generation of a positive walk of length (approximately) $n/2$. We refer again to the end of Section 5 for our conclusions on this cost. We do not use below the Boltzmann sampling for positive walks, since gluing two positive walks of approximate size $n/2$ does not give the same probability to all hybrid walks of a given size.

Let us now discuss the efficiency of the rejection approach based on the language $\mathcal H$.

- When $a < b$, we have $|\mathcal H_n| = \Theta(\alpha_{a,b}^n/n^3)$, while $p_n = \Theta(\alpha_{a,b}^n/n^{3/2})$, so that we gain an order $O(n^{3/2})$ in complexity (comparing with the rejection of positive walks). This leads to a cost $O(n^{1+\gamma+\varepsilon})$ if $c_n$ scales like $\alpha_{a,b}^n n^{-3-\gamma}$, with a $O(n^{1+\varepsilon})$ precomputation.

- When $a = b = 1$, $|\mathcal H_n| = \Theta(2^n/n)$, while $p_n = \Theta(2^n/\sqrt{n})$, so that the gain is of order $\sqrt{n}$. Consequently, the complexity of the rejection algorithm based on $\mathcal H$ is linear. No precomputation nor storage is required.

- For $a > b$, we have $|\mathcal H_n| = \Theta(2^n)$, and similarly $p_n = \Theta(2^n)$. So the complexity gain (compared with the approach that generates positive walks) can only be $\Theta(1)$. The algorithm is still linear.
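The rejection from hybrid walks can be sketched as follows in Python. This is an illustration under the assumption that positive walks are themselves drawn by anticipated rejection, which is the efficient choice only for $a \ge b$; all function names are hypothetical. A hybrid walk is obtained by gluing a positive walk of length $\lfloor n/2 \rfloor$ with the mirror image (the reversed word) of a positive walk of length $\lceil n/2 \rceil$, and it is kept only if it is culminating.

```python
import random
from itertools import accumulate


def positive_walk(n, a, b, rng):
    """Uniform positive walk of length n by anticipated rejection."""
    while True:
        walk, height = [], 0
        for _ in range(n):
            step = a if rng.random() < 0.5 else -b
            height += step
            if height <= 0:
                break
            walk.append(step)
        else:
            return walk


def is_culminating(walk):
    """Positivity and final record conditions on a list of steps."""
    heights = list(accumulate(walk))
    return (bool(heights) and all(h > 0 for h in heights)
            and heights[-1] > max(heights[:-1], default=0))


def culminating_walk_by_hybrid_rejection(n, a, b, rng=random):
    """Uniform culminating walk of length n >= 1, by rejection from hybrid walks."""
    while True:
        v = positive_walk(n // 2, a, b, rng)
        w = positive_walk(n - n // 2, a, b, rng)
        u = v + w[::-1]                      # v followed by the mirror image of w
        if is_culminating(u):                # linear-time membership test
            return u


if __name__ == "__main__":
    print(culminating_walk_by_hybrid_rejection(30, 1, 1))
```

Since the pair $(v, w)$ is uniform and the factorisation of a hybrid walk is unambiguous, every culminating walk of length $n$ is returned with the same probability.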

7. Conclusion and perspectives 

We have studied culminating paths, from the point of view of formal languages, enumerative 
combinatorics and random generation. Our best results in terms of random generation are 
summarized in Table 2.

An important question that is left open is to determine the asymptotic growth of the number of culminating walks when the drift is negative ($a < b$). One possible approach would be to exploit the closed form expression of Proposition 3.3 in the spirit of Proposition 4.1 and [5]. The result might have interesting consequences regarding the random generation of culminating walks. In particular, if $c_n^{a,b} = \Theta\big((p_{n/2}^{a,b})^2\, n^{-\gamma}\big) = \Theta(\alpha_{a,b}^n\, n^{-3-\gamma})$ with $\gamma < 2$, the generation algorithm based on hybrid walks would be faster than the recursive algorithm, at least for generating few paths. However, our numerical data suggest that the ratio $c_n^{a,b}/(p_{n/2}^{a,b})^2$ decreases at least as fast as $n^{-2}$.

Paths | Case | Method | # Attempts | Precomp. | Cost
$\mathcal P^{a,b}$ | $a < b$ | Recursive method, Section 5.2, standard implementation or floating-point implementation | | $O(n^2)$ or $O(n^{1+\varepsilon})$ | $O(n \log n)$
$\mathcal P^{a,b}$ | $a < b$ | Approximate-size Boltzmann, Section 5.4 | $O(1)$ | | $O(n)$
$\mathcal P^{a,b}$ | $a \ge b$ | Anticipated rejection, Section 5.3 | $O(\sqrt{n})$ ($a = b$), $O(1)$ ($a > b$) | | $O(n)$
$\mathcal C^{a,b}$ | | Recursive method, Section 6.1 | | | $O(n)$
$\mathcal C^{a,b}$ | $a < b$ | Recursive method, [30] and Section 6.1, or rejection from hybrid walks, Section 6.2.2 | | | $O(n)$
$\mathcal C^{a,b}$ | $a = b$ | Rejection from hybrid walks, Section 6.2.2 | $O(1)$ | | $O(n)$
$\mathcal C^{a,b}$ | $a > b$ | Rejection from positive walks or hybrid walks, Sections 6.2.1 and 6.2.2 | $O(1)$ | | $O(n)$

Table 2. The complexity of random generation of positive and culminating paths. The cost is that of one random drawing, once the precomputations have been performed. It is assumed that $c_n \sim \alpha_{a,b}^n n^{-3-\gamma}$ if $a < b$.

It would also be interesting to study how the height is distributed on random culminating walks of length $n$. Such a study may provide better algorithms for random generation, especially in the $a < b$ case, where the height is expected to be small. How does the average height scale with $n$? Is there a limiting distribution for some normalized height? This is related to a more ambitious question: is there a limiting process for culminating walks, in the same way discrete excursions converge to the Brownian excursion [29], or discrete meanders to the Brownian meander [28]? In the case $a = b = 1$, a candidate for the limit process could be the meander conditioned (with care) to reach its maximum at time 1. Note that the joint law of the maximum and final position of a meander is known [21], and related to the law of the maximum and minimum of a Brownian bridge, both in the continuous and discrete cases [13]. The case where the maximum coincides with the final position (an event of zero probability in the continuous case) is closely related to our culminating walks.

Future extensions of the present work may also include the study of culminating walks with more than two types of steps, in order to model different kinds of matches and mismatches, and thus capture the whole scoring scheme of the FLASH algorithm. For instance, it is usually considered less drastic to replace a purine base by another purine base (A↔G) rather than a pyrimidine one in DNA. It is thus natural to penalize differently different mismatches. This could be modelled by introducing down steps of different heights.

Lastly, a natural, biologically relevant perspective would be to address the non-uniform gen- 
eration of culminating paths. Indeed, the matches and mismatches may not be uniform over 
a biological sequence, and be subject to local correlations. This is classically modelled by a Markov chain (further conditioned to yield culminating paths). Our algorithms could in principle be adapted to this more general context, but their analysis would need to be carefully worked out. In particular, the drift of random walks would depend on the chain and differ in general from $a - b$. We naturally expect the efficiency of our algorithms to depend on the model, culminating walks with positive drift being much easier to generate than those with a negative drift.
Bertoin, Philippe Chassaing, Jean-François Le Gall and Svante Janson for discussions on the
possible limiting process of culminating paths. 

References 

[1] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. Basic local alignment search tool. J. Molecular Biology, 215(3):403-410, 1990.

[2] A. Ayyer and D. Zeilberger. Two dimensional directed lattice walks with boundaries. ArXiv cond-mat/0701674.

[3] C. Banderier. Combinatoire analytique des chemins et des cartes. PhD thesis, Université Paris 6, 2001.

[4] C. Banderier, M. Bousquet-Mélou, A. Denise, P. Flajolet, D. Gardy, and D. Gouyou-Beauchamps. Generating functions for generating trees. Discrete Math., 246(1-3):29-55, 2002.

[5] C. Banderier and P. Flajolet. Basic analytic combinatorics of directed lattice paths. Theoret. Comput. Sci., 281(1-2):37-80, 2002.

[6] E. Barcucci, R. Pinzani, and R. Sprugnoli. The random generation of underdiagonal walks. In P. Leroux and C. Reutenauer, editors, Proceedings of 4th Conference on Formal Power Series and Algebraic Combinatorics (FPSAC'92). Université du Québec à Montréal, 1992.

[7] E. Barcucci, R. Pinzani, and R. Sprugnoli. The random generation of directed animals. Theoret. Comput. Sci., 127(2):333-350, 1994.

[8] M. Bousquet-Mélou and M. Petkovšek. Linear recurrences with constant coefficients: the multivariate case. Discrete Math., 225(1-3):51-75, 2000.

[9] M. Bousquet-Mélou. Rational and algebraic series in combinatorial enumeration. In Proceedings of the International Congress of Mathematicians, pages 789-826, Madrid, 2006. European Mathematical Society Publishing House.

[10] M. Bousquet-Mélou. Discrete excursions. Séminaire Lotharingien de Combinatoire, 57, 2008. (electronic) Article B57d, 23 pp. ArXiv math.CO/0701171.

[11] A. Califano and I. Rigoutsos. Flash: A fast look-up algorithm for string homology. In Proceedings of the 1st International Conference on Intelligent Systems for Molecular Biology, pages 56-64. AAAI Press, 1993.

[12] F. Carlson. Über Potenzreihen mit ganzzahligen Koeffizienten. Math. Z., 9(1-2):1-13, 1921.

[13] E. Csáki and S. G. Mohanty. Excursion and meander in random walk. Canad. J. Statist., 9(1):57-70, 1981.

[14] A. Denise. Génération aléatoire et uniforme de mots. Discrete Math., 153:69-84, 1996.

[15] A. Denise, I. Dutour, and P. Zimmermann. CS: a MuPAD package for counting and randomly generating combinatorial structures. In Proceedings of 10th Conference on Formal Power Series and Algebraic Combinatorics (FPSAC'98), pages 195-204, 1998. Also published in MathPAD 8 (1) 1998.

[16] A. Denise and P. Zimmermann. Uniform random generation of decomposable structures using floating-point arithmetic. Theoret. Comput. Sci., 218:233-248, 1999.

[17] L. Devroye. Non-Uniform Random Variate Generation. Springer-Verlag, New York, 1986.

[18] P. Di Francesco, E. Guitter, and C. Kristjansen. Integrable 2D Lorentzian gravity and random walks. Nuclear Phys. B, 567(3):515-553, 2000.

[19] P. Duchon. On the enumeration and generation of generalized Dyck words. Discrete Math., 225(1-3):121-135, 2000.

[20] P. Duchon, P. Flajolet, G. Louchard, and G. Schaeffer. Boltzmann samplers for the random generation of combinatorial structures. Combin. Probab. Comput., 13(4-5):577-625, 2004.

[21] R. T. Durrett and D. L. Iglehart. Functionals of Brownian meander and Brownian excursion. Ann. Probability, 5(1):130-135, 1977.

[22] P. Flajolet, X. Gourdon, and P. Dumas. Mellin transforms and asymptotics: harmonic sums. Theoret. Comput. Sci., 144(1-2):3-58, 1995.

[23] P. Flajolet and A. Odlyzko. Singularity analysis of generating functions. SIAM J. Discrete Math., 3(2):216-240, 1990.

[24] P. Flajolet, P. Zimmermann, and B. Van Cutsem. A calculus for the random generation of labelled combinatorial structures. Theoret. Comput. Sci., 132(1-2):1-35, 1994.

[25] D. Foata. A noncommutative version of the matrix inversion formula. Adv. in Math., 31(3):330-349, 1979.

[26] I. M. Gessel. A factorization for formal Laurent series and lattice path enumeration. J. Combin. Theory Ser. A, 28(3):321-337, 1980.

[27] J. E. Hopcroft and J. D. Ullman. Formal languages and their relation to automata. Addison-Wesley, 1969.

[28] D. L. Iglehart. Functional central limit theorems for random walks conditioned to stay positive. Ann. Probability, 2:608-619, 1974.

[29] W. D. Kaigh. An invariance principle for random walk conditioned by a late return to zero. Ann. Probability, 4(1):115-121, 1976.

[30] G. Kucherov, L. Noé, and Y. Ponty. Estimating seed sensibility on homogenous alignments. In IEEE, editor, Proceedings of Fourth IEEE Symposium on Bioinformatics and Bioengineering (BIBE04), page 387, 2004.

[31] J. Labelle. Langages de Dyck généralisés. Ann. Sci. Math. Québec, 17(1):53-64, 1993.

[32] J. Labelle and Y. N. Yeh. Generalized Dyck paths. Discrete Math., 82(1):1-6, 1990.

[33] G. Louchard. Asymptotic properties of some underdiagonal walks generation algorithms. Theoret. Comput. Sci., 218(2):249-262, 1999.

[34] W. R. Pearson and D. J. Lipman. Improved tools for biological sequence comparison. Proceedings of the National Academy of Sciences of the USA, 85:2444-2448, 1988.

[35] Y. Ponty, M. Termier, and A. Denise. GenRGenS: Software for generating random genomic sequences and structures. Bioinformatics, 22(12):1534-1535, 2006.

[36] R. P. Stanley. Enumerative combinatorics. Vol. 1, volume 49 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1997.

[37] R. P. Stanley. Enumerative combinatorics. Vol. 2, volume 62 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1999.

[38] G. X. Viennot. Heaps of pieces. I. Basic definitions and combinatorial lemmas. In Combinatoire énumérative (Montréal, Québec, 1985), volume 1234 of Lecture Notes in Math., pages 321-350. Springer, Berlin, 1986.

[39] H. S. Wilf. A unified setting for sequencing, ranking, and selection algorithms for combinatorial objects. Advances in Math., 24(3):281-291, 1977.

CNRS, LaBRI, Université Bordeaux 1, 351 cours de la Libération, 33405 Talence Cedex, France
AND LRI, Bât 490, Université Paris-Sud, 91405 Orsay Cedex, France
E-mail address: mireille.bousquet@labri.fr and yann.ponty@lri.fr 



